(54) Title: NON-ENDOGENOUS, CONSTITUTIVELY ACTIVATED HUMAN G PROTEIN-COUPLED RECEPTORS
(57) Abstract

Disclosed herein are constitutively activated, non-endogenous versions of endogenous human G protein-coupled receptors comprising
(a) the following amino acid sequence region (C-terminus to N-terminus orientation) and/or (b) the following nucleic acid sequence
region (3' to 5' orientation) transversing the transmembrane-6 (TM6) and intracellular loop-3 (IC3) regions of the GPCR: (a) P1 AA15
X and/or (b) Pcodon (AA-codon)15 Xcodon, respectively. In a most preferred embodiment, P1 and Pcodon are endogenous proline and an
endogenous nucleic acid encoding region encoding proline, respectively, located within TM6 of the non-endogenous GPCR; AA15 and
(AA-codon)15 are 15 endogenous amino acid residues and 15 codons encoding endogenous amino acid residues, respectively; and X and
Xcodon are non-endogenous lysine and a non-endogenous nucleic acid encoding region encoding lysine, respectively, located within IC3
of the non-endogenous GPCR. Because it is most preferred that the non-endogenous human GPCRs which incorporate these mutations
are incorporated into mammalian cells and utilized for the screening of candidate compounds, the non-endogenous human GPCR
incorporating the mutation need not be purified and isolated per se (i.e., these are incorporated within the cellular membrane of a mammalian
cell), although such purified and isolated non-endogenous human GPCRs are well within the purview of this disclosure.
NON-ENDOGENOUS, CONSTITUTIVELY ACTIVATED
HUMAN G PROTEIN-COUPLED RECEPTORS

The benefits of commonly owned U.S. Serial Number 09/170,496, filed
October 13, 1998; U.S. Serial Number 08/839,449, filed April 14, 1997 (now abandoned);
U.S. Serial Number 09/060,188, filed April 14, 1998; U.S. Provisional Number 60/090,783,
filed June 26, 1998; and U.S. Provisional Number 60/095,677, filed on August 7, 1998, are
hereby claimed. Each of the foregoing applications is incorporated by reference herein in
its entirety.

FIELD OF THE INVENTION 

The invention disclosed in this patent document relates to transmembrane
receptors, and more particularly to human G protein-coupled receptors (GPCRs) which have
been altered such that the altered GPCRs are constitutively activated. Most preferably, the altered
human GPCRs are used for the screening of therapeutic compounds.

BACKGROUND OF THE INVENTION

Although a number of receptor classes exist in humans, by far the most abundant and
therapeutically relevant is represented by the G protein-coupled receptor (GPCR) class.
It is estimated that there are some 100,000 genes within the human genome, and of these,
approximately 2%, or 2,000 genes, are estimated to code for GPCRs. Of these, there are
approximately 100 GPCRs for which the endogenous ligand that binds to the GPCR has been
identified. Because of the significant time-lag that exists between the discovery of an endogenous
GPCR and its endogenous ligand, it can be presumed that the remaining 1,900 GPCRs will be
identified and characterized long before the endogenous ligands for these receptors are identified.
Indeed, the rapidity by which the Human Genome Project is sequencing the 100,000 human genes
indicates that the remaining human GPCRs will be fully sequenced within the next few years.
Nevertheless, and despite the efforts to sequence the human genome, it is still very unclear as to
how scientists will be able to rapidly, effectively and efficiently exploit this information to
improve and enhance the human condition. The present invention is geared towards this
important objective.

Receptors, including GPCRs, for which the endogenous ligand has been identified are
referred to as "known" receptors, while receptors for which the endogenous ligand has not been
identified are referred to as "orphan" receptors. This distinction is not merely semantic,
particularly in the case of GPCRs. GPCRs represent an important area for the development of
pharmaceutical products: from approximately 20 of the 100 known GPCRs, 60% of all
prescription pharmaceuticals have been developed. Thus, the orphan GPCRs are to the
pharmaceutical industry what gold was to California in the late 19th century - an opportunity to
drive growth, expansion, enhancement and development. A serious drawback exists, however,
with orphan receptors relative to the discovery of novel therapeutics. This is because the
traditional approach to the discovery and development of pharmaceuticals has required access to
both the receptor and its endogenous ligand. Thus, heretofore, orphan GPCRs have presented the
art with a tantalizing and undeveloped resource for the discovery of pharmaceuticals.

Under the traditional approach to the discovery of potential therapeutics, it is generally the
case that the receptor is first identified. Before drug discovery efforts can be initiated, elaborate,
time consuming and expensive procedures are typically put into place in order to identify, isolate
and generate the receptor's endogenous ligand. This process can require from 3 to 10
years per receptor, at a cost of about $5 million (U.S.) per receptor. These time and financial
resources must be expended before the traditional approach to drug discovery can commence.
This is because traditional drug discovery techniques rely upon so-called "competitive binding
assays" whereby putative therapeutic agents are "screened" against the receptor in an effort to
discover compounds that either block the endogenous ligand from binding to the receptor
("antagonists"), or enhance or mimic the effects of the ligand binding to the receptor ("agonists").
The overall objective is to identify compounds that prevent cellular activation when the ligand
binds to the receptor (the antagonists), or that enhance or increase cellular activity that would
otherwise occur if the ligand was properly binding with the receptor (the agonists). Because the
endogenous ligands for orphan GPCRs are by definition not identified, the ability to discover novel
and unique therapeutics to these receptors using traditional drug discovery techniques is not
possible. The present invention, as will be set forth in greater detail below, overcomes these and
other severe limitations created by such traditional drug discovery techniques.

GPCRs share a common structural motif. All these receptors have seven sequences of
between 22 and 24 hydrophobic amino acids that form seven alpha helices, each of which spans the
membrane (each span is identified by number, i.e., transmembrane-1 (TM-1), transmembrane-2
(TM-2), etc.). The transmembrane helices are joined by strands of amino acids between
transmembrane-2 and transmembrane-3, transmembrane-4 and transmembrane-5, and
transmembrane-6 and transmembrane-7 on the exterior, or "extracellular" side, of the cell
membrane (these are referred to as "extracellular" regions 1, 2 and 3 (EC-1, EC-2 and EC-3),
respectively). The transmembrane helices are also joined by strands of amino acids between
transmembrane-1 and transmembrane-2, transmembrane-3 and transmembrane-4, and
transmembrane-5 and transmembrane-6 on the interior, or "intracellular" side, of the cell
membrane (these are referred to as "intracellular" regions 1, 2 and 3 (IC-1, IC-2 and IC-3),
respectively). The "carboxy" ("C") terminus of the receptor lies in the intracellular space within
the cell, and the "amino" ("N") terminus of the receptor lies in the extracellular space outside of
the cell. The general structure of G protein-coupled receptors is depicted in Figure 1.

Generally, when an endogenous ligand binds with the receptor (often referred to as
"activation" of the receptor), there is a change in the conformation of the intracellular region that
allows for coupling between the intracellular region and an intracellular "G-protein." Although
other G proteins exist, currently, Gq, Gs, Gi, and Go are G proteins that have been identified.
Endogenous ligand-activated GPCR coupling with the G-protein begins a signaling cascade
process (referred to as "signal transduction"). Under normal conditions, signal transduction
ultimately results in cellular activation or cellular inhibition. It is thought that the IC-3 loop as
well as the carboxy terminus of the receptor interact with the G protein. A principal focus of this
invention is directed to the transmembrane-6 (TM6) region and the intracellular-3 (IC3) region of
the GPCR.

Under physiological conditions, GPCRs exist in the cell membrane in equilibrium between
two different conformations: an "inactive" state and an "active" state. As shown schematically in
Figure 2, a receptor in an inactive state is unable to link to the intracellular signaling transduction
pathway to produce a biological response. Changing the receptor conformation to the active state
allows linkage to the transduction pathway (via the G-protein) and produces a biological response.

A receptor may be stabilized in an active state by an endogenous ligand or a compound
such as a drug. Recent discoveries, including but not exclusively limited to modifications to the
amino acid sequence of the receptor, provide means other than endogenous ligands or drugs to
promote and stabilize the receptor in the active state conformation. These means effectively
stabilize the receptor in an active state by simulating the effect of an endogenous ligand binding
to the receptor. Stabilization by such ligand-independent means is termed "constitutive receptor
activation."

As noted above, the use of an orphan receptor for screening purposes has not been
possible. This is because the traditional "dogma" regarding screening of compounds mandates that
the ligand for the receptor be known. By definition, then, this approach has no applicability with
respect to orphan receptors. Thus, by adhering to this dogmatic approach to the discovery of
therapeutics, the art, in essence, has taught and has been taught to forsake the use of orphan
receptors unless and until the endogenous ligand for the receptor is discovered. Given that there
are an estimated 2,000 G protein-coupled receptors, the majority of which are orphan receptors,
such dogma castigates a creative, unique and distinct approach to the discovery of therapeutics.

Information regarding the nucleic acid and/or amino acid sequences of a variety of GPCRs
is summarized below in Table A. Because an important focus of the invention disclosed herein
is directed towards orphan GPCRs, many of the below-cited references are related to orphan
GPCRs. However, this list is not intended to imply, nor is this list to be construed, legally or
otherwise, that the invention disclosed herein is only applicable to orphan GPCRs or the specific
GPCRs listed below. Additionally, certain receptors that have been isolated are not the subject of
publications per se; for example, reference is made to a G Protein-Coupled Receptor database on
the "world-wide web" (neither the named inventors nor the assignee have any affiliation with this
site) that lists GPCRs. Other GPCRs are the subject of patent applications owned by the present
assignee and these are not listed below (including GPR3, GPR6 and GPR12; see U.S. Provisional
Number 60/094879):



Table A

Receptor Name    Publication Reference
-------------    ---------------------
GPR1             23 Genomics 609 (1994)
GPR4             14 DNA and Cell Biology 25 (1995)
GPR5             14 DNA and Cell Biology 25 (1995)
GPR7             28 Genomics 84 (1995)
GPR8             28 Genomics 84 (1995)
GPR9             184 J. Exp. Med. 963 (1996)
GPR10            29 Genomics 335 (1995)
GPR15            32 Genomics 462 (1996)
GPR17            70 J. Neurochem. 1357 (1998)
GPR18            42 Genomics 462 (1997)
GPR20            187 Gene 75 (1997)
GPR21            187 Gene 75 (1997)
GPR22            187 Gene 75 (1997)
GPR24            398 FEBS Lett. 253 (1996)
GPR30            45 Genomics 607 (1997)
GPR31            42 Genomics 519 (1997)
GPR32            50 Genomics 281 (1997)
GPR40            239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR41            239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR43            239 Biochem. Biophys. Res. Commun. 543 (1997)
APJ              136 Gene 355 (1993)
BLR1             22 Eur. J. Immunol. 2759 (1992)
CEPR             231 Biochem. Biophys. Res. Commun. 651 (1997)
EBI1             23 Genomics 643 (1994)
EBI2             67 J. Virol. 2209 (1993)
ETBR-LP2         424 FEBS Lett. 193 (1998)
GPCR-CNS         54 Brain Res. Mol. Brain Res. 152 (1998); 45 Genomics 68 (1997)
GPR-NGA          394 FEBS Lett. 325 (1996)
H9               386 FEBS Lett. 219 (1996)
UBA954           1261 Biochim. Biophys. Acta 121 (1995)
HG38             247 Biochem. Biophys. Res. Commun. 266 (1998)
HM74             5 Int. Immunol. 1239 (1993)
OGR1             35 Genomics 397 (1996)
V28              163 Gene 295 (1995)

As will be set forth and disclosed in greater detail below, utilization of a mutational cassette to
modify the endogenous sequence of a human GPCR leads to a constitutively activated version of
the human GPCR. These non-endogenous, constitutively activated versions of human GPCRs can
be utilized, inter alia, for the screening of candidate compounds to directly identify compounds
of, e.g., therapeutic relevance.



SUMMARY OF THE INVENTION 

Disclosed herein is a non-endogenous, human G protein-coupled receptor comprising
(a) as a most preferred amino acid sequence region (C-terminus to N-terminus orientation)
and/or (b) as a most preferred nucleic acid sequence region (3' to 5' orientation) transversing
the transmembrane-6 (TM6) and intracellular loop-3 (IC3) regions of the GPCR:

(a) P1 AA15 X

wherein:

(1) P1 is an amino acid residue located within the TM6 region of
the GPCR, where P1 is selected from the group consisting of (i)
the endogenous GPCR's proline residue, and (ii) a non-endogenous
amino acid residue other than proline;

(2) AA15 are 15 amino acids selected from the group consisting of
(a) the endogenous GPCR's amino acids, (b) non-endogenous
amino acid residues, and (c) a combination of the endogenous
GPCR's amino acids and non-endogenous amino acids,
excepting that none of the 15 endogenous amino acid residues
that are positioned within the TM6 region of the GPCR is
proline; and

(3) X is a non-endogenous amino acid residue located within the
IC3 region of said GPCR, preferably selected from the group
consisting of lysine, histidine and arginine, and most
preferably lysine, excepting that when the endogenous amino
acid at position X is lysine, then X is an amino acid other than
lysine, preferably alanine;



and/or 



(b) Pcodon (AA-codon)15 Xcodon

wherein:

(1) Pcodon is a nucleic acid sequence within the TM6 region of the
GPCR, where Pcodon encodes an amino acid selected from the
group consisting of (i) the endogenous GPCR's proline residue,
and (ii) a non-endogenous amino acid residue other than proline;

(2) (AA-codon)15 are 15 codons encoding 15 amino acids selected
from the group consisting of (a) the endogenous GPCR's amino
acids, (b) non-endogenous amino acid residues, and (c) a
combination of the endogenous GPCR's amino acids and
non-endogenous amino acids, excepting that none of the 15
endogenous codons within the TM6 region of the GPCR encodes
a proline amino acid residue; and

(3) Xcodon is a nucleic acid encoding region residue located within the
IC3 region of said GPCR, where Xcodon encodes a non-endogenous
amino acid, preferably selected from the group consisting of
lysine, histidine and arginine, and most preferably lysine,
excepting that when the endogenous encoding region at position
Xcodon encodes the amino acid lysine, then Xcodon encodes an amino
acid other than lysine, preferably alanine.

The terms endogenous and non-endogenous in reference to these sequence cassettes are relative
to the endogenous GPCR. For example, once the endogenous proline residue is located within the
TM6 region of a particular GPCR, and the 16th amino acid therefrom is identified for mutation to
constitutively activate the receptor, it is also possible to mutate the endogenous proline residue
(i.e., once the marker is located and the 16th amino acid to be mutated is identified, one may mutate
the marker itself), although it is most preferred that the proline residue not be mutated. Similarly,
and while it is most preferred that AA15 be maintained in their endogenous forms, these amino
acids may also be mutated. The only amino acid that must be mutated in the non-endogenous
version of the human GPCR is X, i.e., the endogenous amino acid that is 16 residues from P1
cannot be maintained in its endogenous form and must be mutated, as further disclosed herein.
Stated again, while it is preferred that in the non-endogenous version of the human GPCR, P1 and
AA15 remain in their endogenous forms (i.e., identical to their wild-type forms), once X is
identified and mutated, any and/or all of P1 and AA15 can be mutated. This applies to the nucleic
acid sequences as well. In those cases where the endogenous amino acid at position X is lysine,
then in the non-endogenous version of such GPCR, X is an amino acid other than lysine,
preferably alanine.

Accordingly, and as a hypothetical example, if the endogenous GPCR has the following
endogenous amino acid sequence at the above-noted positions:

P-AACCTTGGRRRDDDE -Q 
then any of the following exemplary and hypothetical cassettes would fall within the scope of 
the disclosure (non-endogenous amino acids are set forth in bold): 

P-AACCTTGGRRRDDDE -K 
P-AACCrrfflGRRDDDE-K 
P-ADEETTGGRRRDDDE -A 
P-LLKFMSTWZLVAAPQ -K 
A-LLKFMSTWZLVAAPQ -K 

It is also possible to add amino acid residues within AA15, but such an approach is not particularly
advanced. Indeed, in the most preferred embodiments, the only amino acid that differs in the non-
endogenous version of the human GPCR as compared with the endogenous version of that GPCR
is the amino acid in position X; mutation of this amino acid itself leads to constitutive activation
of the receptor.
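
By way of illustration only, and not limitation, the selection and mutation of position X described above can be sketched in Python. The sketch assumes that the receptor sequence is supplied in the conventional N-terminus to C-terminus orientation (so that X, in IC3, lies 16 residues before the TM6 proline) and that the index of the TM6 proline has already been determined by other means. The function name `mutate_position_x` and its parameters are hypothetical and do not appear in this disclosure.

```python
def mutate_position_x(sequence: str, tm6_proline_index: int) -> str:
    """Mutate the residue 16 positions N-terminal to the TM6 proline marker.

    sequence          -- one-letter amino acid codes, N- to C-terminus (assumed)
    tm6_proline_index -- 0-based index of the TM6 proline (assumed known)
    """
    if not 0 <= tm6_proline_index < len(sequence):
        raise ValueError("proline index is outside the sequence")
    if sequence[tm6_proline_index] != "P":
        raise ValueError("marker residue is not proline")
    x_index = tm6_proline_index - 16  # P, then AA15, then X (the 16th residue)
    if x_index < 0:
        raise ValueError("fewer than 16 residues precede the proline marker")
    # Lysine is most preferred; if X is already lysine, alanine is preferred.
    replacement = "A" if sequence[x_index] == "K" else "K"
    return sequence[:x_index] + replacement + sequence[x_index + 1:]


# Hypothetical cassette from the text, written C-terminus to N-terminus.
cassette_c_to_n = "PAACCTTGGRRRDDDEQ"
n_to_c = cassette_c_to_n[::-1]  # reverse into N-to-C order
mutated = mutate_position_x(n_to_c, len(n_to_c) - 1)
print(mutated[::-1])  # PAACCTTGGRRRDDDEK
```

Only position X is changed in this sketch, which corresponds to the most preferred embodiment; mutating P1 or AA15 as well would require additional, separately specified substitutions.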

Thus, in particularly preferred embodiments, P1 and Pcodon are endogenous proline and an
endogenous nucleic acid encoding region encoding proline, respectively; and X and Xcodon are non-
endogenous lysine or alanine and a non-endogenous nucleic acid encoding region encoding lysine
or alanine, respectively, with lysine being most preferred. Because it is most preferred that the
non-endogenous versions of the human GPCRs which incorporate these mutations are
incorporated into mammalian cells and utilized for the screening of candidate compounds, the non-
endogenous human GPCR incorporating the mutation need not be purified and isolated per se (i.e.,
these are incorporated within the cellular membrane of a mammalian cell), although such purified
and isolated non-endogenous human GPCRs are well within the purview of this disclosure. Gene-
targeted and transgenic non-human mammals (preferably rats and mice) incorporating the non-
endogenous human GPCRs are also within the purview of this invention; in particular, gene-
targeted mammals are most preferred in that these animals will incorporate the non-endogenous
versions of the human GPCRs in place of the non-human mammal's endogenous GPCR-encoding
region (techniques for generating such non-human mammals to replace the non-human mammal's
protein encoding region with a human encoding region are well known; see, for example, U.S.
Patent No. 5,777,194).

It has been discovered that these changes to an endogenous human GPCR render the
GPCR constitutively active such that, as will be further disclosed herein, the non-endogenous,
constitutively activated version of the human GPCR can be utilized for, inter alia, the direct
screening of candidate compounds without the need for the endogenous ligand. Thus, methods
for using these materials, and products identified by these methods are also within the purview of
the following disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 shows a generalized structure of a G protein-coupled receptor with the numbers
assigned to the transmembrane helices, the intracellular loops, and the extracellular loops.

Figure 2 schematically shows the two states, active and inactive, for a typical G
protein-coupled receptor and the linkage of the active state to the second messenger
transduction pathway.

Figure 3 is a sequence diagram of the preferred vector pCMV, including restriction
enzyme site locations.

Figure 4 is a diagrammatic representation of the signal measured comparing pCMV, non-
endogenous, constitutively active GPR30 inhibition of GPR6-mediated activation of CRE-Luc
reporter with endogenous GPR30 inhibition of GPR6-mediated activation of CRE-Luc
reporter.

Figure 5 is a diagrammatic representation of the signal measured comparing pCMV, non-
endogenous, constitutively activated GPR17 inhibition of GPR3-mediated activation of CRE-
Luc reporter with endogenous GPR17 inhibition of GPR3-mediated activation of CRE-Luc
reporter.

Figure 6 provides diagrammatic results of the signal measured comparing control
pCMV, endogenous APJ and non-endogenous APJ.

Figure 7 provides an illustration of IP3 production from non-endogenous human 5-
HT2A receptor as compared to the endogenous version of this receptor.

Figure 8 shows dot-blot format results for GPR1 (8A), GPR30 (8B) and APJ (8C).

DETAILED DESCRIPTION 

The scientific literature that has evolved around receptors has adopted a number of terms
to refer to ligands having various effects on receptors. For clarity and consistency, the following
definitions will be used throughout this patent document. To the extent that these definitions
conflict with other definitions for these terms, the following definitions shall control:

AGONISTS shall mean compounds that activate the intracellular response when they bind
to the receptor, or enhance GTP binding to membranes.

AMINO ACID ABBREVIATIONS used herein are set out below:

ALANINE          ALA    A
ARGININE         ARG    R
ASPARAGINE       ASN    N
ASPARTIC ACID    ASP    D
CYSTEINE         CYS    C
GLUTAMIC ACID    GLU    E
GLUTAMINE        GLN    Q
GLYCINE          GLY    G
HISTIDINE        HIS    H
ISOLEUCINE       ILE    I
LEUCINE          LEU    L
LYSINE           LYS    K
METHIONINE       MET    M
PHENYLALANINE    PHE    F
PROLINE          PRO    P
SERINE           SER    S
THREONINE        THR    T
TRYPTOPHAN       TRP    W
TYROSINE         TYR    Y
VALINE           VAL    V



PARTIAL AGONISTS shall mean compounds which activate the intracellular response
when they bind to the receptor to a lesser degree/extent than do agonists, or enhance GTP binding
to membranes to a lesser degree/extent than do agonists.

ANTAGONIST shall mean compounds that competitively bind to the receptor at the
same site as the agonists but which do not activate the intracellular response initiated by the active
form of the receptor, and can thereby inhibit the intracellular response by agonists or partial
agonists. ANTAGONISTS do not diminish the baseline intracellular response in the absence of
an agonist or partial agonist.

WO 00/22129    PCT/US99/23938

CANDIDATE COMPOUND shall mean a molecule (for example, and not limitation,
a chemical compound) which is amenable to a screening technique. Preferably, the phrase
"candidate compound" does not include compounds which were publicly known to be compounds
selected from the group consisting of inverse agonist, agonist or antagonist to a receptor, as
previously determined by an indirect identification process ("indirectly identified compound");
more preferably, not including an indirectly identified compound which has previously been
determined to have therapeutic efficacy in at least one mammal; and, most preferably, not
including an indirectly identified compound which has previously been determined to have
therapeutic utility in humans.

CODON shall mean a grouping of three nucleotides (or equivalents to nucleotides) which
generally comprise a nucleoside (adenosine (A), guanosine (G), cytidine (C), uridine (U) and
thymidine (T)) coupled to a phosphate group and which, when translated, encodes an amino acid.

COMPOUND EFFICACY shall mean a measurement of the ability of a compound to
inhibit or stimulate receptor functionality, as opposed to receptor binding affinity. A preferred
means of detecting compound efficacy is via measurement of, e.g., [35S]GTPγS binding, as further
disclosed in the Example section of this patent document.

CONSTITUTIVELY ACTIVATED RECEPTOR shall mean a receptor subject to
constitutive receptor activation. In accordance with the invention disclosed herein, a non-
endogenous, human constitutively activated G protein-coupled receptor is one that has been
mutated to include the amino acid cassette P1 AA15 X as set forth in greater detail below.

CONSTITUTIVE RECEPTOR ACTIVATION shall mean stabilization of a receptor
in the active state by means other than binding of the receptor with its endogenous ligand or a
chemical equivalent thereof. Preferably, a G protein-coupled receptor subjected to constitutive
receptor activation in accordance with the invention disclosed herein evidences at least a 10%
difference in response (increase or decrease, as the case may be) to the signal measured for
constitutive activation as compared with the endogenous form of that GPCR, more preferably,
about a 25% difference in such comparative response, and most preferably about a 50% difference
in such comparative response. When used for the purposes of directly identifying candidate
compounds, it is most preferred that the signal difference be at least about 50% such that there is
a sufficient difference between the endogenous signal and the non-endogenous signal to
differentiate between selected candidate compounds. In most instances, the "difference" will be
an increase in signal; however, with respect to Gs-coupled GPCRs, the "difference" measured is
preferably a decrease, as will be set forth in greater detail below.
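
As a non-limiting sketch, the comparative response described in the foregoing definition can be expressed as a percent difference between the two measured signals. The formula below (absolute difference relative to the endogenous signal) is an assumption made for illustration, as are the function names and the hard cut-offs at 10%, 25% and 50%; the definition itself speaks of "about" 25% and "about" 50%.

```python
def percent_difference(endogenous_signal: float, non_endogenous_signal: float) -> float:
    """Percent difference of the non-endogenous signal relative to the endogenous one.

    The direction of the change (increase or decrease) is ignored here,
    since the definition accepts either, as the case may be.
    """
    if endogenous_signal == 0:
        raise ValueError("endogenous signal must be non-zero")
    return 100.0 * abs(non_endogenous_signal - endogenous_signal) / abs(endogenous_signal)


def preference_tier(pct: float) -> str:
    """Map a percent difference onto the tiers named in the definition."""
    if pct >= 50.0:
        return "most preferred"
    if pct >= 25.0:
        return "more preferred"
    if pct >= 10.0:
        return "preferred"
    return "below threshold"


# Hypothetical signal values in arbitrary units.
print(preference_tier(percent_difference(100.0, 160.0)))  # most preferred
print(preference_tier(percent_difference(200.0, 170.0)))  # preferred
```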

CONTACT or CONTACTING shall mean bringing at least two moieties together,
whether in an in vitro system or an in vivo system.

DIRECTLY IDENTIFYING or DIRECTLY IDENTIFIED, in relationship to the
phrase "candidate compound", shall mean the screening of a candidate compound against a
constitutively activated G protein-coupled receptor, and assessing the compound efficacy of such
compound. This phrase is, under no circumstances, to be interpreted or understood to be
encompassed by or to encompass the phrase "indirectly identifying" or "indirectly identified."

ENDOGENOUS shall mean a material that is naturally produced by the genome of the
species. ENDOGENOUS in reference to, for example and not limitation, GPCR, shall mean that
which is naturally produced by a human, an insect, a plant, a bacterium, or a virus. By contrast,
the term NON-ENDOGENOUS in this context shall mean that which is not naturally produced
by the genome of a species. For example, and not limitation, a receptor which is not
constitutively active in its endogenous form, but when mutated by using the cassettes disclosed
herein and thereafter becomes constitutively active, is most preferably referred to herein as a "non-
endogenous, constitutively activated receptor." Both terms can be utilized to describe both "in
vivo" and "in vitro" systems. For example, and not limitation, in a screening approach, the
endogenous or non-endogenous receptor may be in reference to an in vitro screening system
whereby the receptor is expressed on the cell-surface of a mammalian cell. As a further example,
and not limitation, where the genome of a mammal has been manipulated to include a non-
endogenous constitutively activated receptor, screening of a candidate compound by means of an
in vivo system is viable.

HOST CELL shall mean a cell capable of having a Plasmid and/or Vector incorporated
therein. In the case of a prokaryotic Host Cell, a Plasmid is typically replicated as an autonomous
molecule as the Host Cell replicates (generally, the Plasmid is thereafter isolated for introduction
into a eukaryotic Host Cell); in the case of a eukaryotic Host Cell, a Plasmid is integrated into the
cellular DNA of the Host Cell such that when the eukaryotic Host Cell replicates, the Plasmid
replicates. Preferably, for the purposes of the invention disclosed herein, the Host Cell is
eukaryotic, more preferably, mammalian, and most preferably selected from the group consisting
of 293, 293T and COS-7 cells.

INDIRECTLY IDENTIFYING or INDIRECTLY IDENTIFIED means the traditional
approach to the drug discovery process involving identification of an endogenous ligand specific
for an endogenous receptor, screening of candidate compounds against the receptor for
determination of those which interfere and/or compete with the ligand-receptor interaction, and
assessing the efficacy of the compound for affecting at least one second messenger pathway
associated with the activated receptor.

INHIBIT or INHIBITING, in relationship to the term "response" shall mean that a
response is decreased or prevented in the presence of a compound as opposed to in the absence of
the compound.

INVERSE AGONISTS shall mean compounds which bind to either the endogenous form
of the receptor or to the constitutively activated form of the receptor, and which inhibit the
baseline intracellular response initiated by the active form of the receptor below the normal base
level of activity which is observed in the absence of agonists or partial agonists, or decrease GTP
binding to membranes. Preferably, the baseline intracellular response is inhibited in the presence
of the inverse agonist by at least 30%, more preferably by at least 50%, and most preferably by at
least 75%, as compared with the baseline response in the absence of the inverse agonist.

KNOWN RECEPTOR shall mean an endogenous receptor for which the endogenous
ligand specific for that receptor has been identified.

LIGAND shall mean an endogenous, naturally occurring molecule specific for an
endogenous, naturally occurring receptor.

MUTANT or MUTATION in reference to an endogenous receptor's nucleic acid and/or
amino acid sequence shall mean a specified change or changes to such endogenous sequences such
that a mutated form of an endogenous, non-constitutively activated receptor evidences constitutive
activation of the receptor. In terms of equivalents to specific sequences, a subsequent mutated
form of a human receptor is considered to be equivalent to a first mutation of the human receptor
if (a) the level of constitutive activation of the subsequent mutated form of the receptor is
substantially the same as that evidenced by the first mutation of the receptor; and (b) the percent
sequence (amino acid and/or nucleic acid) homology between the subsequent mutated form of the
receptor and the first mutation of the receptor is at least about 80%, more preferably at least about
90% and most preferably at least 95%. Ideally, and owing to the fact that the most preferred
cassettes disclosed herein for achieving constitutive activation include a single amino acid and/or
codon change between the endogenous and the non-endogenous forms of the GPCR (i.e., X or
Xcodon), the percent sequence homology should be at least 98%.
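
For illustration only, the percent sequence comparison in part (b) of the foregoing definition can be sketched as follows. This disclosure does not state how homology is to be calculated; the sketch assumes the simplest case, namely position-by-position identity over two ungapped sequences of equal length, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two equal-length sequences agree.

    Assumption: no gaps or alignment step; a real comparison of sequences
    of differing lengths would first require an alignment.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 100-residue receptor differing only at position X.
endogenous = "A" * 99 + "Q"
non_endogenous = "A" * 99 + "K"
print(percent_identity(endogenous, non_endogenous))  # 99.0
```

Under this assumption, a single substitution keeps any sequence of 50 or more residues at or above 98%.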

ORPHAN RECEPTOR shall mean an endogenous receptor for which the endogenous
ligand specific for that receptor has not been identified or is not known.

PHARMACEUTICAL COMPOSITION shall mean a composition comprising at least
one active ingredient, whereby the composition is amenable to investigation for a specified,
efficacious outcome in a mammal (for example, and not limitation, a human). Those of ordinary
skill in the art will understand and appreciate the techniques appropriate for determining whether
an active ingredient has a desired efficacious outcome based upon the needs of the artisan.

PLASMID shall mean the combination of a Vector and cDNA. Generally, a Plasmid is
introduced into a Host Cell for the purpose of replication and/or expression of the cDNA as a
protein.

STIMULATE or STIMULATING, in relationship to the term "response," shall mean that
a response is increased in the presence of a compound as opposed to in the absence of the
compound.

TRANSVERSE or TRANSVERSING, in reference to either a defined nucleic acid
sequence or a defined amino acid sequence, shall mean that the sequence is located within at least
two different and defined regions. For example, in an amino acid sequence that is 10 amino acid
moieties in length, where 3 of the 10 moieties are in the TM6 region of a GPCR and the remaining
7 moieties are in the IC3 region of the GPCR, the 10 amino acid moiety can be described as
transversing the TM6 and IC3 regions of the GPCR.

VECTOR in reference to cDNA shall mean a circular DNA capable of incorporating at
least one cDNA and capable of incorporation into a Host Cell.



The order of the following sections is set forth for presentational efficiency and is not
intended, nor should be construed, as a limitation on the disclosure or the claims to follow.
A. Introduction 

The traditional study of receptors has always proceeded from the a priori assumption
(historically based) that the endogenous ligand must first be identified before discovery could
proceed to find antagonists and other molecules that could affect the receptor. Even in cases
where an antagonist might have been known first, the search immediately extended to looking for
the endogenous ligand. This mode of thinking has persisted in receptor research even after the
discovery of constitutively activated receptors. What has not been heretofore recognized is that
it is the active state of the receptor that is most useful for discovering agonists, partial agonists, and
inverse agonists of the receptor. For those diseases which result from an overly active receptor or
an under-active receptor, what is desired in a therapeutic drug is a compound which acts to
diminish the active state of a receptor or enhance the activity of the receptor, respectively, not
necessarily a drug which is an antagonist to the endogenous ligand. This is because a compound
that reduces or enhances the activity of the active receptor state need not bind at the same site as
the endogenous ligand. Thus, as taught by a method of this invention, any search for therapeutic
compounds should start by screening compounds against the ligand-independent active state.

Screening candidate compounds against non-endogenous, constitutively activated GPCRs
allows for the direct identification of candidate compounds which act at these cell surface
receptors, without requiring any prior knowledge or use of the receptor's endogenous ligand. By
determining areas within the body where the endogenous versions of such GPCRs are expressed
and/or over-expressed, it is possible to determine related disease/disorder states which are
associated with the expression and/or over-expression of these receptors; such an approach is
disclosed in this patent document.
B. Disease/Disorder Identification and/or Selection

Most preferably, inverse agonists to the non-endogenous, constitutively activated GPCR
can be identified using the materials of this invention. Such inverse agonists are ideal candidates
as lead compounds in drug discovery programs for treating diseases related to these receptors.
Because of the ability to directly identify inverse agonists, partial agonists or agonists to these
receptors, thereby allowing for the development of pharmaceutical compositions, a search for
diseases and disorders associated with these receptors is possible. For example, scanning both
diseased and normal tissue samples for the presence of these receptors now becomes more than an
academic exercise or one which might be pursued along the path of identifying, in the case of an
orphan receptor, an endogenous ligand. Tissue scans can be conducted across a broad range of
healthy and diseased tissues. Such tissue scans provide a preferred first step in associating a
specific receptor with a disease and/or disorder.

Preferably, the DNA sequence of the endogenous GPCR is used to make a probe for either
radiolabeled cDNA or RT-PCR identification of the expression of the GPCR in tissue samples.
The presence of a receptor in a diseased tissue, or the presence of the receptor at elevated or
decreased concentrations in diseased tissue compared to a normal tissue, can be preferably utilized
to identify a correlation with that disease. Receptors can equally well be localized to regions of
organs by this technique. Based on the known functions of the specific tissues to which the
receptor is localized, the putative functional role of the receptor can be deduced.

C. A "Human GPCR Proline Marker" Algorithm and the Creation of
Non-Endogenous, Constitutively-Active Human GPCRs

Among the many challenges facing the biotechnology arts is the unpredictability in
gleaning genetic information from one species and correlating that information to another species
- nowhere in this art does this problem evidence more annoying exacerbation than in the genetic
sequences that encode nucleic acids and proteins. Thus, for consistency and because of the highly
unpredictable nature of this art, the following invention is limited, in terms of mammals, to human
GPCRs - applicability of this invention to other mammalian species, while a potential possibility,
is considered beyond mere rote application.

In general, when attempting to apply common "rules" from one related protein sequence
to another or from one species to another, the art has typically resorted to sequence alignment, i.e.,
sequences are linearized and attempts are then made to find regions of commonality between two
or more sequences. While useful, this approach does not always prove to result in meaningful
information. In the case of GPCRs, while the general structural motif is identical for all GPCRs,
the variations in lengths of the TMs, ECs and ICs make such alignment approaches from one
GPCR to another difficult at best. Thus, while it may be desirable to apply a consistent approach
to, e.g., constitutive activation from one GPCR to another, because of the great diversity in
sequence length, fidelity, etc. from one GPCR to the next, a generally applicable, and readily
successful mutational alignment approach is in essence not possible. In an analogy, such an
approach is akin to having a traveler start a journey at point A by giving the traveler dozens of
different maps to point B, without any scale or distance markers on any of the maps, and then
asking the traveler to find the shortest and most efficient route to destination B only by using the
maps. In such a situation, the task can be readily simplified by having (a) a common "place-marker"
on each map, and (b) the ability to measure the distance from the place-marker to
destination B - this, then, will allow the traveler to select the most efficient route from
starting-point A to destination B.

In essence, a feature of the invention is to provide such coordinates within human GPCRs
that readily allow for creation of a constitutively active form of the human GPCRs.

As those in the art appreciate, the transmembrane region of a cell is highly hydrophobic;
thus, using standard hydrophobicity plotting techniques, those in the art are readily able to
determine the TM regions of a GPCR, and specifically TM6 (this same approach is also
applicable to determining the EC and IC regions of the GPCR). It has been discovered that within
the TM6 region of human GPCRs, a common proline residue (generally near the middle of TM6)
acts as a constitutive activation "marker." By counting 15 amino acids from the proline marker,
the 16th amino acid (which is located in the IC3 loop), when mutated from its endogenous form
to a non-endogenous form, leads to constitutive activation of the receptor. For convenience, we
refer to this as the "Human GPCR Proline Marker" Algorithm. Although the non-endogenous
amino acid at this position can be any of the amino acids, most preferably, the non-endogenous
amino acid is lysine. While not wishing to be bound by any theory, we believe that this position
itself is unique and that the mutation at this location impacts the receptor to allow for constitutive
activation.

We note that, for example, when the endogenous amino acid at the 16th position is already
lysine (as is the case with GPR4 and GPR32), then in order for X to be a non-endogenous amino
acid, it must be other than lysine; thus, in those situations where the endogenous GPCR has an
endogenous lysine residue at the 16th position, the non-endogenous version of that GPCR
preferably incorporates an amino acid other than lysine, preferably alanine, histidine and arginine,
at this position. Of further note, it has been determined that GPR4 appears to be linked to Gs and
active in its endogenous form (data not shown).
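
The counting rule described above can be sketched as follows; this is a minimal illustration
and not a definitive implementation. It assumes (i) that the TM6 proline has already been located
by other means (e.g., hydrophobicity plotting, which is not modeled here) and is supplied as a
0-based index into an N-terminus-to-C-terminus sequence, and (ii) that, because the formula
P-AA15-X is recited in C-terminus-to-N-terminus orientation with X in IC3, the 16th position lies
16 residues toward the N-terminus from the proline. The function and field names are hypothetical,
and the order of the fallback residues used when the endogenous residue is already lysine is an
arbitrary choice.

```python
from typing import NamedTuple

MOST_PREFERRED = "K"          # lysine, most preferred per the text
FALLBACKS = ("A", "H", "R")   # alanine, histidine, arginine; order is an assumption


class MarkerMutation(NamedTuple):
    position: int            # 0-based index of the residue that is mutated
    endogenous: str          # residue found there in the endogenous sequence
    replacement: str         # non-endogenous residue substituted in
    mutated_sequence: str    # full sequence carrying the single substitution


def apply_proline_marker(sequence: str, tm6_proline_index: int) -> MarkerMutation:
    """Mutate the residue 16 positions N-terminal to the TM6 proline marker."""
    seq = sequence.upper()
    if not 0 <= tm6_proline_index < len(seq):
        raise ValueError("proline index is outside the sequence")
    if seq[tm6_proline_index] != "P":
        raise ValueError("the supplied index does not point at a proline")
    position = tm6_proline_index - 16   # skip 15 residues; the 16th is the target
    if position < 0:
        raise ValueError("fewer than 16 residues precede the proline")
    endogenous = seq[position]
    # Lysine unless the endogenous residue is already lysine.
    replacement = MOST_PREFERRED if endogenous != MOST_PREFERRED else FALLBACKS[0]
    mutated = seq[:position] + replacement + seq[position + 1:]
    return MarkerMutation(position, endogenous, replacement, mutated)


if __name__ == "__main__":
    # Toy sequence only: R at index 4, fifteen L residues, then the marker P at index 20.
    toy = "AAAA" + "R" + "L" * 15 + "P" + "VVVVV"
    print(apply_proline_marker(toy, 20))
```

The toy sequence is not a receptor sequence; it serves only to show which position the rule
selects.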

Because there are only 20 naturally occurring amino acids (although the use of
non-naturally occurring amino acids is also viable), selection of a particular non-endogenous amino
acid for substitution at this 16th position is viable and allows for efficient selection of a
non-endogenous amino acid that fits the needs of the investigator. However, as noted, the more
preferred non-endogenous amino acids at the 16th position are lysine, histidine, arginine and
alanine, with lysine being most preferred. Those of ordinary skill in the art are credited with the
ability to readily determine proficient methods for changing the sequence of a codon to achieve
a desired mutation.

It has also been discovered that occasionally, but not always, the proline residue marker
will be preceded in TM6 by W2 (i.e., W2P¹AA₁₅X) where W is tryptophan and 2 is any amino
acid residue.
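
A search for the W2P pattern just described can be sketched with a regular expression. The
sketch assumes that the TM6 segment has already been delimited and is supplied in
N-terminus-to-C-terminus order; the function name is hypothetical. Because the text states that
the pattern is present only occasionally, an empty result does not rule out a proline marker.

```python
import re
from typing import List


def find_w2p_prolines(tm6_segment: str) -> List[int]:
    """Return 0-based indices of prolines preceded by tryptophan and any one residue."""
    # The lookahead lets overlapping occurrences be reported as well.
    return [m.start() + 2 for m in re.finditer(r"(?=W.P)", tm6_segment.upper())]


if __name__ == "__main__":
    # Toy TM6-like segment, not taken from any receptor.
    print(find_w2p_prolines("FLLCWLPYHV"))
```

An index returned by this function can serve as the proline position from which the 15-residue
count is made.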

Our discovery, amongst other things, negates the need for unpredictable and complicated
sequence alignment approaches commonly used by the art. Indeed, the strength of our discovery,
while an algorithm in nature, is that it can be applied in a facile manner to human GPCRs, with
dexterous simplicity by those in the art, to achieve a unique and highly useful end-product, i.e., a
constitutively activated version of a human GPCR. Because many years and significant amounts
of money will be required to determine the endogenous ligands for the human GPCRs that the
Human Genome Project is uncovering, the disclosed invention not only reduces the time necessary
to positively exploit this sequence information, but does so at significant cost-savings. This
approach truly validates the importance of the Human Genome Project because it allows for the
utilization of genetic information to not only understand the role of the GPCRs in, e.g., diseases,
but also provides the opportunity to improve the human condition.
D. Screening of Candidate Compounds 

1. Generic GPCR screening assay techniques 

When a G protein receptor becomes constitutively active, it couples to a G protein (e.g.,
Gq, Gs, Gi, Go) and stimulates release and subsequent binding of GTP to the G protein. The G
protein then acts as a GTPase and slowly hydrolyzes the GTP to GDP, whereby the receptor,
under normal conditions, becomes deactivated. However, constitutively activated receptors,
including the non-endogenous, human constitutively active GPCRs of the present invention,
continue to exchange GDP for GTP. A non-hydrolyzable analog of GTP, [35S]GTPγS, can be
used to monitor enhanced binding to G proteins present on membranes which express
constitutively activated receptors. It is reported that [35S]GTPγS can be used to monitor G protein
coupling to membranes in the absence and presence of ligand. An example of this monitoring,
among other examples well-known and available to those in the art, was reported by Traynor and
Nahorski in 1995. The preferred use of this assay system is for initial screening of candidate
compounds because the system is generically applicable to all G protein-coupled receptors
regardless of the particular G protein that interacts with the intracellular domain of the receptor.



2. Specific GPCR screening assay techniques

Once candidate compounds are identified using the "generic" G protein-coupled
receptor assay (i.e., an assay to select compounds that are agonists, partial
agonists, or inverse agonists), further screening to confirm that the compounds have
interacted at the receptor site is preferred. For example, a compound identified by the
"generic" assay may not bind to the receptor, but may instead merely "uncouple" the G
protein from the intracellular domain.

a. Gs and Gi.



Gs stimulates the enzyme adenylyl cyclase. Gi (and Go), on the other hand,
inhibit this enzyme. Adenylyl cyclase catalyzes the conversion of ATP to cAMP; thus,
constitutively activated GPCRs that couple the Gs protein are associated with increased
cellular levels of cAMP. On the other hand, constitutively activated GPCRs that couple the
Gi (or Go) protein are associated with decreased cellular levels of cAMP. See, generally,
"Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.)
Nichols, J.G. et al eds. Sinauer Associates, Inc. (1992). Thus, assays that detect cAMP can
be utilized to determine if a candidate compound is, e.g., an inverse agonist to the receptor
(i.e., such a compound would decrease the levels of cAMP). A variety of approaches known
in the art for measuring cAMP can be utilized; a most preferred approach relies upon the use
of anti-cAMP antibodies in an ELISA-based format. Another type of assay that can be
utilized is a whole cell second messenger reporter system assay. Promoters on genes drive the
expression of the proteins that a particular gene encodes. Cyclic AMP drives gene expression by
promoting the binding of a cAMP-responsive DNA binding protein or transcription factor (CREB)
which then binds to the promoter at specific sites called cAMP response elements and drives the
expression of the gene. Reporter systems can be constructed which have a promoter containing
multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or luciferase.
Thus, a constitutively activated Gs-linked receptor causes the accumulation of cAMP that then
activates the gene and expression of the reporter protein. The reporter protein such as
β-galactosidase or luciferase can then be detected using standard biochemical assays (Chen et al.
1995). With respect to GPCRs that link to Gi (or Go), and thus decrease levels of cAMP, an
approach to the screening of, e.g., inverse agonists, based upon utilization of receptors that link to
Gs (and thus increase levels of cAMP) is disclosed in the Example section with respect to GPR17
and GPR30.
b. Go and Gq.

Gq and Go are associated with activation of the enzyme phospholipase C,
which in turn hydrolyzes the phospholipid PIP2, releasing two intracellular messengers:
diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). Increased accumulation of IP3
is associated with activation of Gq- and Go-associated receptors. See, generally, "Indirect
Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.) Nichols,
J.G. et al eds. Sinauer Associates, Inc. (1992). Assays that detect IP3 accumulation can be
utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gq- or Go-associated
receptor (i.e., such a compound would decrease the levels of IP3). Gq-associated
receptors can also be examined using an AP1 reporter assay in that Gq-dependent
phospholipase C causes activation of genes containing AP1 elements; thus, activated
Gq-associated receptors will evidence an increase in the expression of such genes, whereby
inverse agonists thereto will evidence a decrease in such expression, and agonists will
evidence an increase in such expression. Commercially available assays for such detection
are available.
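
The relationships recited in subsections a. and b. above can be summarized, for illustration
only, as a small lookup table. The sketch makes the following assumptions: Go is omitted because
the text associates it with both adenylyl cyclase inhibition and phospholipase C activation; and
the expected shift for an inverse agonist at a Gi-linked receptor is inferred here simply as the
reverse of the constitutive effect, which the text does not state explicitly. All names are
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Coupling:
    effector: str             # enzyme acted on by the G protein
    messenger: str            # second messenger that is measured
    constitutive_effect: str  # "increase" or "decrease" when the receptor is active


COUPLINGS: Dict[str, Coupling] = {
    "Gs": Coupling("adenylyl cyclase (stimulated)", "cAMP", "increase"),
    "Gi": Coupling("adenylyl cyclase (inhibited)", "cAMP", "decrease"),
    "Gq": Coupling("phospholipase C (activated)", "IP3", "increase"),
}


def expected_inverse_agonist_shift(g_protein: str) -> Tuple[str, str]:
    """Return (messenger, direction) expected when an inverse agonist is applied."""
    coupling = COUPLINGS[g_protein]
    # An inverse agonist is taken to reverse the constitutive effect.
    direction = "decrease" if coupling.constitutive_effect == "increase" else "increase"
    return coupling.messenger, direction


if __name__ == "__main__":
    for name in COUPLINGS:
        print(name, expected_inverse_agonist_shift(name))
```

The table indicates only which second messenger to measure and the direction of change to look
for; the choice of assay format remains with the artisan.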

E. Medicinal Chemistry 

Generally, but not always, direct identification of candidate compounds is preferably
conducted in conjunction with compounds generated via combinatorial chemistry techniques,
whereby thousands of compounds are randomly prepared for such analysis. Generally, the
results of such screening will be compounds having unique core structures; thereafter, these
compounds are preferably subjected to additional chemical modification around a preferred
core structure(s) to further enhance the medicinal properties thereof. Such techniques are
known to those in the art and will not be addressed in detail in this patent document.

F. Pharmaceutical Compositions 

Candidate compounds selected for further development can be formulated into
pharmaceutical compositions using techniques well known to those in the art. Suitable
pharmaceutically-acceptable carriers are available to those in the art; for example, see Remington's
Pharmaceutical Sciences, 16th Edition, 1980, Mack Publishing Co. (Osol et al., eds.).

G. Other Utility 

Although a preferred use of the non-endogenous versions of the disclosed human GPCRs
is for the direct identification of candidate compounds as inverse agonists, agonists or partial
agonists (preferably for use as pharmaceutical agents), these receptors can also be utilized in
research settings. For example, in vitro and in vivo systems incorporating these receptors can be
utilized to further elucidate and understand the roles of the receptors in the human condition, both
normal and diseased, as well as understanding the role of constitutive activation as it applies to
understanding the signaling cascade. A value in these non-endogenous receptors is that their
utility as a research tool is enhanced in that, because of their unique features, the disclosed
receptors can be used to understand the role of a particular receptor in the human body before the
endogenous ligand therefor is identified. Other uses of the disclosed receptors will become
apparent to those in the art based upon, inter alia, a review of this patent document.

EXAMPLES 

The following examples are presented for purposes of elucidation, and not limitation,
of the present invention. Following the teaching of this patent document that a mutational
cassette may be utilized in the IC3 loop of human GPCRs based upon a position relative to
a proline residue in TM6 to constitutively activate the receptor, and while specific nucleic acid
and amino acid sequences are disclosed herein, those of ordinary skill in the art are credited
with the ability to make minor modifications to these sequences while achieving the same or
substantially similar results reported below. Particular approaches to sequence mutation are
within the purview of the artisan based upon the particular needs of the artisan.

Example 1

Preparation of Endogenous Human GPCRs

A variety of GPCRs were utilized in the Examples to follow. Some endogenous human
GPCRs were graciously provided in expression vectors (as acknowledged below) and other
endogenous human GPCRs were synthesized de novo using publicly-available sequence
information.

1. GPR1 (GenBank Accession Number: U13666)

The human cDNA sequence for GPR1 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR1 cDNA (1.4 kb fragment) was excised from the pRcCMV
vector as an NdeI-XbaI fragment and was subcloned into the NdeI-XbaI site of pCMV vector (see
Figures). Nucleic acid (SEQ.ID.NO.: 1) and amino acid (SEQ.ID.NO.: 2) sequences for human
GPR1 were thereafter determined and verified.

2. GPR4 (GenBank Accession Numbers: L36148, U35399, U21051)

The human cDNA sequence for GPR4 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR4 cDNA (1.4 kb fragment) was excised from the pRcCMV
vector as an ApaI(blunted)-XbaI fragment and was subcloned (with most of the 5' untranslated
region removed) into HindIII(blunted)-XbaI site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 3)
and amino acid (SEQ.ID.NO.: 4) sequences for human GPR4 were thereafter determined and
verified.

3. GPR5 (GenBank Accession Number: L36149)

The cDNA for human GPR5 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 64°C
for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the sequence:
5'-TATGAATTGAGATGCTGTAAACGTGGGTGC-3' (SEQ.ID.NO.: 5)
and the 3' primer contained a BamHI site with the sequence:
5'-TCCGGATGGAGGTGGACGTGGGGGTGGACC-3' (SEQ.ID.NO.: 6).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 7) and amino acid (SEQ.ID.NO.:
8) sequences for human GPR5 were thereafter determined and verified.
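
As a worked illustration of the cycle conditions recited in this Example, a PCR program can be
written down as data and its nominal cycling time computed. The sketch below uses the conditions
stated above (30 cycles of 94°C for 1 min; 64°C for 1 min; 72°C for 1.5 min); the class and
function names are hypothetical, and ramp times and any initial denaturation or final extension
steps are not recited in the text and are therefore ignored.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    temp_c: float    # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


def nominal_cycling_minutes(steps: Sequence[Step], cycles: int) -> float:
    """Total hold time over all cycles, excluding ramping between temperatures."""
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    return cycles * sum(step.minutes for step in steps)


if __name__ == "__main__":
    program = [Step(94.0, 1.0), Step(64.0, 1.0), Step(72.0, 1.5)]
    print(nominal_cycling_minutes(program, cycles=30))
```

The same structure accommodates the other cycle conditions recited below by changing the
annealing temperature, the extension time and the number of cycles.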

4. GPR7 (GenBank Accession Number: U22491)

The cDNA for human GPR7 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site
with the sequence:
5'-GGAAGGTTGGGGGAGGGGAGGTGGGGGGCT-3' (SEQ.ID.NO.: 9)
and the 3' primer contained a BamHI site with the sequence:
5'-GGGGATGGGGAGGCTGGGGGAGtGAGGGTGG-3' (SEQ.ID.NO.: 10).
The 1.1 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 11) and amino acid (SEQ.ID.NO.:
12) sequences for human GPR7 were thereafter determined and verified.

5. GPR8 (GenBank Accession Number: U22492)

The cDNA for human GPR8 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-CGGAATTCGTCAACGGTCCCAGCTACAATG-3' (SEQ.ID.NO.: 13)
and the 3' primer contained a BamHI site with the sequence:
5'-ATGGATCCCAGGCCCTTCAGCACCGCAATAT-3' (SEQ.ID.NO.: 14).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. All 4 cDNA clones sequenced contained a possible
polymorphism involving a change of amino acid 206 from Arg to Gln. Aside from this
difference, nucleic acid (SEQ.ID.NO.: 15) and amino acid (SEQ.ID.NO.: 16) sequences for human
GPR8 were thereafter determined and verified.
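
The statement that a primer "contained" a given restriction site can be checked by a simple
string search. The sketch below applies such a check to the two GPR8 primers recited above, using
the standard recognition sequences for EcoRI (GAATTC), BamHI (GGATCC) and HindIII (AAGCTT); the
function name and the choice of enzymes in the table are illustrative assumptions.

```python
from typing import Dict

# Standard recognition sequences, written 5' to 3'.
RECOGNITION_SITES: Dict[str, str] = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def sites_in_primer(primer: str) -> Dict[str, int]:
    """Map each enzyme whose site occurs in the primer to the 0-based start index."""
    seq = primer.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    gpr8_5_prime = "CGGAATTCGTCAACGGTCCCAGCTACAATG"    # SEQ.ID.NO.: 13
    gpr8_3_prime = "ATGGATCCCAGGCCCTTCAGCACCGCAATAT"   # SEQ.ID.NO.: 14
    print(sites_in_primer(gpr8_5_prime))
    print(sites_in_primer(gpr8_3_prime))
```

The sites searched here are palindromic, so the same string is found whichever strand is read;
a primer lacking the stated site may indicate a transcription error in the sequence as printed.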

6. GPR9 (GenBank Accession Number: X95876)

The cDNA for human GPR9 was generated and cloned into pCMV expression
vector as follows: PCR was performed using a clone (provided by Brian O'Dowd) as template and
pfu polymerase (Stratagene) with the buffer system provided by the manufacturer supplemented
with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4 nucleotides. The cycle
condition was 25 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 2.5 min. The 5' PCR
primer contained an EcoRI site with the sequence:



5'-ACGAATTCAGCCATGGTCCTTGAGGTGAGTGACCACCAAGTGCTAAAT-3'
(SEQ.ID.NO.: 17)
and the 3' primer contained a BamHI site with the sequence:
5'-GAGGATCCTGGAATGCGGGGAAGTCAG-3' (SEQ.ID.NO.: 18).
The 1.2 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 19) and amino acid (SEQ.ID.NO.: 20) sequences
for human GPR9 were thereafter determined and verified.

7. GPR9-6 (GenBank Accession Number: U45982)

The cDNA for human GPR9-6 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-TTAAGCTTGAGCTAATGCGATCTTGTGTGC-3' (SEQ.ID.NO.: 21)
and the 3' primer contained a BamHI site with the sequence:
5'-TTGGATGCAAAAGAACGATGGACCTCAGAG-3' (SEQ.ID.NO.: 22).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 23) and amino acid (SEQ.ID.NO.: 24)
sequences for human GPR9-6 were thereafter determined and verified.
8. GPR10 (GenBank Accession Number: U32672)

The human cDNA sequence for GPR10 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR10 cDNA (1.3 kb fragment) was excised from the
pRcCMV vector as an EcoRI-XbaI fragment and was subcloned into EcoRI-XbaI site of pCMV
vector. Nucleic acid (SEQ.ID.NO.: 25) and amino acid (SEQ.ID.NO.: 26) sequences for human
GPR10 were thereafter determined and verified.

9. GPR15 (GenBank Accession Number: U34806)

The human cDNA sequence for GPR15 was provided in pCDNA3 by Brian
O'Dowd (University of Toronto). GPR15 cDNA (1.5 kb fragment) was excised from the
pCDNA3 vector as a HindIII-Bam fragment and was subcloned into HindIII-Bam site of pCMV
vector. Nucleic acid (SEQ.ID.NO.: 27) and amino acid (SEQ.ID.NO.: 28) sequences for human
GPR15 were thereafter determined and verified.

10. GPR17 (GenBank Accession Number: Z94154)

The cDNA for human GPR17 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-CTAGAATTCTGACTCCAGCCAAAGCATGAAT-3' (SEQ.ID.NO.: 29)
and the 3' primer contained a BamHI site with the sequence:
5'-GCTGGATCCTAAACAGTCTGCGCTCGGCCT-3' (SEQ.ID.NO.: 30).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 31) and amino acid (SEQ.ID.NO.:
32) sequences for human GPR17 were thereafter determined and verified.

11. GPR18 (GenBank Accession Number: L42324)

The cDNA for human GPR18 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 54°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-ATAAGATGATCACCCTGAACAATCAAGAT-3' (SEQ.ID.NO.: 33)
and the 3' primer contained an EcoRI site with the sequence:
5'-TCCGAATTCATAACATTTCACTGTTTATATTGC-3' (SEQ.ID.NO.: 34).
The 1.0 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV
expression vector. All 8 cDNA clones sequenced contained 4 possible polymorphisms involving
changes of amino acid 12 from Thr to Pro, amino acid 86 from Ala to Glu, amino acid 97 from
Ile to Leu and amino acid 310 from Leu to Met. Aside from these changes, nucleic acid
(SEQ.ID.NO.: 35) and amino acid (SEQ.ID.NO.: 36) sequences for human GPR18 were thereafter
determined and verified.

12. GPR20 (GenBank Accession Number: U66579)

The cDNA for human GPR20 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-CCAAGCTTCCAGGCCTGGGGTGTGCTGG-3' (SEQ.ID.NO.: 37)
and the 3' primer contained a BamHI site with the sequence:
5'-ATGGATCCTGACCTTCGGCCCCTGGCAGA-3' (SEQ.ID.NO.: 38).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 39) and amino acid (SEQ.ID.NO.: 40)
sequences for human GPR20 were thereafter determined and verified.

13. GPR21 (GenBank Accession Number: U66580)

The cDNA for human GPR21 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-GAGAATTCACTCCTGAGCTCAAGATGAACT-3' (SEQ.ID.NO.: 41)
and the 3' primer contained a BamHI site with the sequence:
5'-CGGGATCCCCGTAACTGAGCCACTTCAGAT-3' (SEQ.ID.NO.: 42).
The 1.1 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 43) and amino acid (SEQ.ID.NO.: 44)
sequences for human GPR21 were thereafter determined and verified.
14. GPR22 (GenBank Accession Number: U66581)

The cDNA for human GPR22 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 50°C
for 1 min; and 72°C for 1.5 min. The 5' PCR primer was kinased with the sequence:
5'-TCCCCCGGGAAAAAAACCAACTGCTCCAAA-3' (SEQ.ID.NO.: 45)
and the 3' primer contained a BamHI site with the sequence:
5'-TAGGATCCATTTGAATGTGGATTTGGTGAAA-3' (SEQ.ID.NO.: 46).
The 1.38 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 47) and amino acid (SEQ.ID.NO.: 48)
sequences for human GPR22 were thereafter determined and verified.

15. GPR24 (GenBank Accession Number: U71092)

The cDNA for human GPR24 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for
1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contains a HindIII site with the
sequence:
5'-GTGAAGCTTGCCTCTGGTGCCTGCAGGAGG-3' (SEQ.ID.NO.: 49)
and the 3' primer contains an EcoRI site with the sequence:
5'-GCAGAATTCCCGGTGGCGTGTTGTGGTGCCC-3' (SEQ.ID.NO.: 50).
The 1.3 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. The nucleic acid (SEQ.ID.NO.: 51) and amino acid sequence
(SEQ.ID.NO.: 52) for human GPR24 were thereafter determined and verified.
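
Each cloning step in this Example relies on a restriction site carried by a PCR primer. As an illustrative sketch only (the function name `site_position` and the small site table are assumptions introduced here, not part of the disclosure), the following checks that the GPR24 primers as printed carry the HindIII and EcoRI recognition sequences they are stated to contain:

```python
# Recognition sequences of the enzymes named in the cloning steps.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(RESTRICTION_SITES[enzyme])


# GPR24 primers as printed (SEQ.ID.NO.: 49 and SEQ.ID.NO.: 50).
gpr24_5prime = "GTGAAGCTTGCCTCTGGTGCCTGCAGGAGG"
gpr24_3prime = "GCAGAATTCCCGGTGGCGTGTTGTGGTGCCC"

print("HindIII site at index", site_position(gpr24_5prime, "HindIII"))
print("EcoRI site at index", site_position(gpr24_3prime, "EcoRI"))
```

In both primers the site sits three bases in from the 5' end, leaving a short leader ahead of the site.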

16. GPR30 (GenBank Accession Number: U63917) 

The cDNA for human GPR30 was generated and cloned as follows: the coding 
sequence of GPR30 (1128 bp in length) was amplified from genomic DNA using the primers:
5'-GGCGGATCCATGGATGTGACTTCCCAA-3' (SEQ.ID.NO.: 53) and
5'-GGCGGATCCCTACACGGCACTGCTGAA-3' (SEQ.ID.NO.: 54).
The amplified product was then cloned into a commercially available vector, pCR2.1 (Invitrogen),
using a "TOPO-TA Cloning Kit" (Invitrogen, #K4500-01), following manufacturer instructions.
The full-length GPR30 insert was liberated by digestion with BamHI, separated from the vector
by agarose gel electrophoresis, and purified using a Sephaglas BandPrep™ ([...]
9285-01) following manufacturer instructions. The nucleic acid (SEQ.ID.NO.: 55) and amino acid
sequence (SEQ.ID.NO.: 56) for human GPR30 were thereafter determined and verified.
17. GPR31 (GenBank Accession Number: U65402) 
The cDNA for human GPR31 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 58°C
for 1 min; and 72°C for 2 min. The 5' PCR primer contained an EcoRI site with the sequence:
5'-AAGGAATrCACGGCCGGGTGATGCCATrCCC-3' (SEQ.ID.NO.: 57)
and the 3' primer contained a BamHI site with the sequence:
5'-GGTGGATCCATAAACACGGGCGTrGAGGAC-3' (SEQ.ID.NO.: 58).
The 1.0 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 59) and amino acid (SEQ.ID.NO.:
60) sequences for human GPR31 were thereafter determined and verified.

18. GPR32 (GenBank Accession Number: AF045764) 

The cDNA for human GPR32 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for
1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-TAAGAATTCCATAAAAATTATGGAATGG-3' (SEQ.ID.NO.: 243)
and the 3' primer contained a BamHI site with the sequence:
5'-CCAGGATCCAGCTGAAGTCTTCCATCATTC-3' (SEQ.ID.NO.: 244).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 245) and amino acid (SEQ.ID.NO.:
246) sequences for human GPR32 were thereafter determined and verified.

19. GPR40 (GenBank Accession Number: AF024687) 
The cDNA for human GPR40 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 65°C for
1 min and 72°C for 1 min and 10 sec. The 5' PCR primer contained an EcoRI site with the
sequence
5'-GGAGAATTGGGCGGCCCCATGGACCTGCCCCC-3' (SEQ.ID.NO.: 247)
and the 3' primer contained a BamHI site with the sequence
5'-GGTGGATCGCCCGAGCAGTGGCGTTACTTC-3' (SEQ.ID.NO.: 248).
The 1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site
of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 249) and amino acid (SEQ.ID.NO.: 250)
sequences for human GPR40 were thereafter determined and verified.

20. GPR41 (GenBank Accession Number: AF024688)
The cDNA for human GPR41 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, [...]
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 65°C for
1 min and 72°C for 1 min and 10 sec. The 5' PCR primer contained a HindIII site with the
sequence:
5'-CTCAAGCTTACrCTCTCTCACCAGTGC3CCAC-3' (SEQ.ID.NO.: 251)
and the 3' primer was kinased with the sequence
5'-CCCTCCTCCCCCGGAGGACCTAGC-3' (SEQ.ID.NO.: 252).
The 1 kb PCR fragment was digested with HindIII and cloned into HindIII-blunt site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 253) and amino acid (SEQ.ID.NO.: 254)
sequences for human GPR41 were thereafter determined and verified.

21. GPR43 (GenBank Accession Number: AF024690)
The cDNA for human GPR43 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, [...]
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for
1 min; and 72°C for 1 min and 10 sec. The 5' PCR primer contains a HindIII site with the
sequence:
5'-TTrAAGCTTCCCCTCCAGGATGCTGCCGGAC-3' (SEQ.ID.NO.: 255)
and the 3' primer contained an EcoRI site with the sequence:
5'-GGCGAATTCTGAAGGTCCAGGGAAACTGCTA-3' (SEQ.ID.NO.: 256).
The 1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site
of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 257) and amino acid (SEQ.ID.NO.: 258)
sequences for human GPR43 were thereafter determined and verified.

22. APJ (GenBank Accession Number: U03642)

Human APJ cDNA (in pRcCMV vector) was provided by Brian O'Dowd
(University of Toronto). The human APJ cDNA was excised from the pRcCMV vector as an
EcoRI-XbaI (blunted) fragment and was subcloned into EcoRI-SmaI site of pCMV vector.
Nucleic acid (SEQ.ID.NO.: 61) and amino acid (SEQ.ID.NO.: 62) sequences for human APJ
were thereafter determined and verified.

23. BLR1 (GenBank Accession Number: X68149)

The cDNA for human BLR1 was generated and cloned into pCMV expression
vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-TGAGAATTCTGGTGACTCACAGCCGGCACAG-3' (SEQ.ID.NO.: 63)
and the 3' primer contained a BamHI site with the sequence:
5'-GCCGGATCCAAGGAAAAGCAGCAATAAAAGG-3' (SEQ.ID.NO.: 64).
The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 65) and amino acid (SEQ.ID.NO.: 66) sequences
for human BLR1 were thereafter determined and verified.

24. CEPR (GenBank Accession Number: U77827)

The cDNA for human CEPR was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; [...]
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-CAAAGCTTGAAA(3CTGCACGGTGCAGAGAC-3' (SEQ.ID.NO.: 67)
and the 3' primer contained a BamHI site with the sequence:
5'-GCGGATCCCGAGTCACACCCTGGCrcGGCC-3' (SEQ.ID.NO.: 68).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 69) and amino acid (SEQ.ID.NO.: 70)
sequences for human CEPR were thereafter determined and verified.

25. EBI1 (GenBank Accession Number: L31581)

The cDNA for human EBI1 was generated and cloned into pCMV expression
vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-ACAGAATTGGTGTGTGGTrTTACCGCGGAG-3' (SEQ.ID.NO.: 71)
and the 3' primer contained a BamHI site with the sequence:
5'-CTCGGATGGAGGGAGAAGAGTGGGGTATGG-3' (SEQ.ID.NO.: 72).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 73) and amino acid (SEQ.ID.NO.:
74) sequences for human EBI1 were thereafter determined and verified.

26. EBI2 (GenBank Accession Number: L08177) 

The cDNA for human EBI2 was generated and cloned into pCMV expression
vector as follows: PCR was performed using cDNA clone (graciously provided by Kevin Lynch,
University of Virginia Health Sciences Center; the vector utilized was not identified by the source)
as template and pfu polymerase (Stratagene) with the buffer system provided by the manufacturer
supplemented with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4
nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 60°C for 1 min; and 72°C for
1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:
5'-CTGGAATTCACCTGGACCACCACCAATGGATA-3' (SEQ.ID.NO.: 75)
and the 3' primer contained a BamHI site with the sequence
5'-CTCGGATCCTGCAAAGTTTGTCATACAGTT-3' (SEQ.ID.NO.: 76).
The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 77) and amino acid (SEQ.ID.NO.:
78) sequences for human EBI2 were thereafter determined and verified.

27. ETBR-LP2 (GenBank Accession Number: D38449) 
The cDNA for human ETBR-LP2 was generated and cloned into pCMV
expression vector as follows: PCR was performed using brain cDNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 65°C for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-GTGGAATTCTGCTGGTCATGCAGGGATGGGG-3' (SEQ.ID.NO.: 79)
and the 3' primer contained a BamHI site with the sequence:
5'-GGTGGATCGGGAGGGGTAGTGGGGGGTGAG-3' (SEQ.ID.NO.: 80).
The 1.5 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 81) and amino acid
(SEQ.ID.NO.: 82) sequences for human ETBR-LP2 were thereafter determined and verified.

28. GHSR (GenBank Accession Number: U60179)

The cDNA for human GHSR was generated and cloned into pCMV expression
vector as follows: PCR was performed using hippocampus cDNA as template and TaqPlus
Precision polymerase (Stratagene) with the buffer system provided by the manufacturer, 0.25 µM
of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of:
94°C for 1 min; 68°C for 1 min; and 72°C for 1 min and 10 sec. For first round PCR, the 5' PCR
primer sequence was:
5'-ATGTGGAAGGCGACGCCCAGCG-3' (SEQ.ID.NO.: 83)
and the 3' primer sequence was:
5'-TCATGTATrAATACTAGATrcr-3' (SEQ.ID.NO.: 84).
Two microliters of the first round PCR was used as template for the second round PCR where the
5' primer was kinased with sequence:
5'-TACCATGTGGAACGCGACGCGCAGCGAAGAGGCGGGGT-3' (SEQ.ID.NO.: 85)
and the 3' primer contained an EcoRI site with the sequence:
5'-CGGAATrCATGTATTAATACrAGATTCrGTCCAGGCGCG-3' (SEQ.ID.NO.: 86).
The 1.1 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 87) and amino acid (SEQ.ID.NO.: 88) sequences
for human GHSR were thereafter determined and verified.

29. GPCR-CNS (GenBank Accession Number: AF017262)
The cDNA for human GPCR-CNS was generated and cloned into pCMV
expression vector as follows: PCR was performed using brain cDNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 65°C for 1 min; and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the
sequence:
5'-GCAAGCTTGTGCCCTCACCAAGCCATCK:GAGCC-3' (SEQ.ID.NO.: 89)
and the 3' primer contained an EcoRI site with the sequence:
5'-CGGAATTCAGCAATGAGTTCCGACAGAAGC-3' (SEQ.ID.NO.: 90).
The 1.9 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. All nine clones sequenced contained a potential polymorphism
involving a S284C change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 91) and amino
acid (SEQ.ID.NO.: 92) sequences for human GPCR-CNS were thereafter determined and verified.

30. GPR-NGA (GenBank Accession Number: U55312)
The cDNA for human GPR-NGA was generated and cloned into pCMV
expression vector as follows: PCR was performed using genomic DNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min, 56°C for 1 min and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-CAGAATTCAGAGAAAAAAAGTGAATATGGTTnT-3' (SEQ.ID.NO.: 93)
and the 3' primer contained a BamHI site with the sequence:
5'-TTGGATCCCTGGTGCATAACAATTGAAAGAAT-3' (SEQ.ID.NO.: 94).
The 1.3 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 95) and amino acid (SEQ.ID.NO.:
96) sequences for human GPR-NGA were thereafter determined and verified.

31. H9 (GenBank Accession Number: U52219)

The cDNA for human H9 was generated and cloned into pCMV expression
vector as follows: PCR was performed using pituitary cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 62°C for
1 min and 72°C for 2 min. The 5' PCR primer contains a HindIII site with the sequence:
5'-GGAAAGCTTAACGATCCCCAC3GAGCAACAT-3' (SEQ.ID.NO.: 97)
and the 3' primer contains a BamHI site with the sequence:
5'-CTGGGATCCTACGAGAGCATTnTCACACAG-3' (SEQ.ID.NO.: 98).
The 1.9 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI
site of pCMV expression vector. When compared to the published sequences, a
different isoform with a 12 bp in-frame insertion in the cytoplasmic tail was also identified and
designated "H9b." Both isoforms contain two potential polymorphisms involving changes
of amino acid P320S and amino acid G448A. Isoform H9a contained another potential
polymorphism of amino acid S493N, while isoform H9b contained two additional potential
polymorphisms involving changes of amino acid I502T and amino acid A532T
(corresponding to amino acid 528 of isoform H9a). Nucleic acid (SEQ.ID.NO.: 99) and
amino acid (SEQ.ID.NO.: 100) sequences for human H9 were thereafter determined and
verified (in the section below, both isoforms were mutated in accordance with the Human
GPCR Proline Marker Algorithm).

32. HB954 (GenBank Accession Number: D38449)

The cDNA for human HB954 was generated and cloned into pCMV expression
vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin
Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 58°C for 1 min
and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the sequence:
5'-TGGAAGCTTGGGGATGGGAGATAAGGGGAGGT-3' (SEQ.ID.NO.: 101)
and the 3' primer contained an EcoRI site with the sequence:
5'-CGTGAATTCCAAGAATTTACAATCCTTGCT-3' (SEQ.ID.NO.: 102).
The 1.6 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 103) and amino acid
(SEQ.ID.NO.: 104) sequences for human HB954 were thereafter determined and verified.

33. HG38 (GenBank Accession Number: AF062006) 
The cDNA for human HG38 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin
Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 56°C for 1 min and
72°C for 1 min and 30 sec. Two PCR reactions were performed to separately obtain the 5' and
3' fragment. For the 5' fragment, the 5' PCR primer contained a HindIII site with the sequence:
5'-GGGAAGGTTGGGGGAGGATGGAGAGGTGCG-3' (SEQ.ID.NO.: 259)
and the 3' primer contained a BamHI site with the sequence:
5'-AGAGGATGGAAATGCAGAGGACTGGTAAGC-3' (SEQ.ID.NO.: 260).
This 1.5 kb 5' PCR fragment was digested with HindIII and BamHI and cloned into a HindIII-BamHI
site of pCMV. For the 3' fragment, the 5' PCR primer was kinased with the sequence:
5'-GTATAAGTGGGTTACATGGTTTAAG-3' (SEQ.ID.NO.: 261)
and the 3' primer contained an EcoRI site with the sequence:
5'-TTTGAATTCACATATTAATTAGAGACATGG-3' (SEQ.ID.NO.: 262).
The 1.4 kb 3' PCR fragment was digested with EcoRI and subcloned into a blunt-EcoRI site of
pCMV vector. The 5' and 3' fragments were then ligated together through a common EcoRV site
to generate the full length cDNA clone. Nucleic acid (SEQ.ID.NO.: 263) and amino acid
(SEQ.ID.NO.: 264) sequences for human HG38 were thereafter determined and verified.

34. HM74 (GenBank Accession Number: D10923) 

The cDNA for human HM74 was generated and cloned into pCMV expression
vector as follows: PCR was performed using either genomic DNA or thymus cDNA (pooled) as
template and rTth polymerase (Perkin Elmer) with the buffer system provided by the
manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle
condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The
5' PCR primer contained an EcoRI site with the sequence:
5'-GGAGAATTGAGTAGGGGAGGGGGTGGATG-3' (SEQ.ID.NO.: 105)
and the 3' primer was kinased with the sequence:
5'-GGAGGATGGAGGAAAGGTTAGGGCGAGTGC-3' (SEQ.ID.NO.: 106).
The 1.3 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of
pCMV expression vector. Clones sequenced revealed a potential polymorphism involving a
N94K change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 107) and amino acid
(SEQ.ID.NO.: 108) sequences for human HM74 were thereafter determined and verified.

35. MIG (GenBank Accession Numbers: AF044600 and AF044601)
The cDNA for human MIG was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and TaqPlus Precision
polymerase (Stratagene) for first round PCR or pfu polymerase (Stratagene) for second round PCR
with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
(TaqPlus Precision) or 0.5 mM (pfu) of each of the 4 nucleotides. When pfu was used, 10%
DMSO was included in the buffer. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C
for 1 min; and 72°C for: (a) 1 min for first round PCR; and (b) 2 min for second round PCR.
Because there is an intron in the coding region, two sets of primers were separately used to
generate overlapping 5' and 3' fragments. The 5' fragment PCR primers were:

5'-ACCATGGCTTGCAATGGCAGTGCGGCCAGGGGGCACT-3' (external sense)
(SEQ.ID.NO.: 109)
and
5'-CGACCAGGACAAACAGCATCTTGGTCACTTGTCTCCGGC-3' (internal antisense)
(SEQ.ID.NO.: 110).

The 3' fragment PCR primers were:

5'-GACCAAGATGCTGTTTGTCCTGGTCGTGGTGTTTGGCAT-3' (internal sense)
(SEQ.ID.NO.: 111) and
5'-CGGAATTCAGGATGGATCGGTCTCTTGCTGCGCCT-3' (external antisense with an
EcoRI site) (SEQ.ID.NO.: 112).

The 5' and 3' fragments were ligated together by using the first round PCR as template and the
kinased external sense primer and external antisense primer to perform second round PCR. The
1.2 kb PCR fragment was digested with EcoRI and cloned into the blunt-EcoRI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 113) and amino acid (SEQ.ID.NO.: 114)
sequences for human MIG were thereafter determined and verified.
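
The second round PCR for MIG can join the two fragments only because the internal sense and internal antisense primers are complementary over a shared stretch. The sketch below (helper names `reverse_complement` and `overlap_length` are introduced here for illustration and are not from the disclosure) measures that shared stretch for the two internal primers as printed:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap_length(sense: str, antisense: str) -> int:
    """Longest k for which the first k bases of `sense` base-pair with
    the first k bases of `antisense` (both primers written 5' to 3')."""
    for k in range(min(len(sense), len(antisense)), 0, -1):
        if reverse_complement(sense[:k]) == antisense[:k]:
            return k
    return 0


internal_antisense = "CGACCAGGACAAACAGCATCTTGGTCACTTGTCTCCGGC"  # SEQ.ID.NO.: 110
internal_sense = "GACCAAGATGCTGTTTGTCCTGGTCGTGGTGTTTGGCAT"      # SEQ.ID.NO.: 111

print("overlap (bases):", overlap_length(internal_sense, internal_antisense))
```

For the sequences as printed, the 5' ends of the two internal primers pair over 26 bases, which is the region through which the first round products can prime one another.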

36. OGR1 (GenBank Accession Number: U48405)
The cDNA for human OGR1 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-GGAAGCTTCAGGCCCAAAGATGGGGAACAT-3' (SEQ.ID.NO.: 115)
and the 3' primer contained a BamHI site with the sequence:
5'-GTGGATCCACCCGCGGAGGACCCAGGCTAG-3' (SEQ.ID.NO.: 116).
The 1.1 kb PCR fragment was digested with BamHI and cloned into the EcoRV-BamHI site
of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 117) and amino acid (SEQ.ID.NO.:
118) sequences for human OGR1 were thereafter determined and verified.

37. Serotonin 5HT2A

The cDNA encoding endogenous human 5HT2A receptor was obtained by RT-PCR
using human brain poly-A+ RNA; a 5' primer from the 5' untranslated region with an Xho I
restriction site:
5'-GACCTCGAGTCCTTCTACACCTCATC-3' (SEQ.ID.NO.: 119)
and a 3' primer from the 3' untranslated region containing an Xba I site:
5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO.: 120)
PCR was performed using either TaqPlus™ precision polymerase (Stratagene) or rTth™
polymerase (Perkin Elmer) with the buffer system provided by the manufacturers, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 57°C for 1 min; and 72°C for 2 min. The 1.5 kb PCR fragment was digested with Xba I
and subcloned into Eco RV-Xba I site of pBluescript. The resulting cDNA clones were fully
sequenced and found to encode two amino acid changes from the published sequences. The first
one was a T25N mutation in the N-terminal extracellular domain; the second is an H452Y
mutation. Because cDNA clones derived from two independent PCR reactions using Taq
polymerase from two different commercial sources (TaqPlus™ from Stratagene and rTth™ from
Perkin Elmer) contained the same two mutations, these mutations are likely to represent sequence
polymorphisms rather than PCR errors. With these exceptions, the nucleic acid (SEQ.ID.NO.:
121) and amino acid (SEQ.ID.NO.: 122) sequences for human 5HT2A were thereafter determined
and verified.

38. Serotonin 5HT2C

The cDNA encoding endogenous human 5HT2C receptor was obtained from
human brain poly-A+ RNA by RT-PCR. The 5' and 3' primers were derived from the 5' and 3'
untranslated regions and contained the following sequences:
5'-GACCTCGAGGTTGCTTAAGACTGAAGC-3' (SEQ.ID.NO.: 123)
5'-ATTTCTAGACATATGTAGCTTGTACCG-3' (SEQ.ID.NO.: 124)
Nucleic acid (SEQ.ID.NO.: 125) and amino acid (SEQ.ID.NO.: 126) sequences for human 5HT2C
were thereafter determined and verified.

39. V28 (GenBank Accession Number: U20350) 

The cDNA for human V28 was generated and cloned into pCMV expression
vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin
Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min;
and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site with the sequence:
5'-GGTAAGCTTGGCAGTCCACGCCAGGCCTTC-3' (SEQ.ID.NO.: 127)
and the 3' primer contained an EcoRI site with the sequence:
5'-TCCGAATTCTCrGTAGACACAAGGCnTGG-3' (SEQ.ID.NO.: 128)
The 1.1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 129) and amino acid (SEQ.ID.NO.:
130) sequences for human V28 were thereafter determined and verified.

Example 2 

Preparation of Non-Endogenous Human GPCRs 
1. Site-Directed Mutagenesis 

Mutagenesis based upon the Human GPCR Proline Marker approach disclosed herein was
performed on the foregoing endogenous human GPCRs using Transformer Site-Directed
Mutagenesis Kit (Clontech) according to the manufacturer instructions. For this mutagenesis
approach, a Mutation Probe and a Selection Marker Probe (unless otherwise indicated, the probe
of SEQ.ID.NO.: 132 was the same throughout) were utilized, and the sequences of these for the
specified sequences are listed below in Table B (the parenthetical number is the SEQ.ID.NO.).
For convenience, the codon mutation incorporated into the human GPCR is also noted, in standard
form:



Table B

Receptor Identifier (Codon Mutation) | Mutation Probe Sequence (5'-3') (SEQ.ID.NO.) | Selection Marker Probe Sequence (5'-3') (SEQ.ID.NO.)
GPR1 (F245K) | GATCrcCAGTAGGCATAAGTGGACAATTCTGG (131) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (132)
GPR4 (K223A) | AGAAGGCCAAGATCGCGCGGCTGGCCCTCA (133) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR5 (V224K) | CGGCGCCACXXiCACXjAAAAAGCTCATCTTC (134) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR7 (T250K) | GCCAAGAAGCGGGTGAAGTTCCTGGTGGTGGCA (135) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR8 (T259K) | CAGGCGGAAGGTGAAAGTCCTGGTCCrCGT (136) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR9 (M254K) | CGGCGCCTGCGGGCCAAGOGGCTGGTGGTGGTG (137) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR9-6 (L241K) | CCAAGCACAAAGCCAAGAAAGTGACCATCAC (138) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR10 (F276K) | GCGCCGGCGCACTAAATGCrTGCTGGTGGT (139) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR15 (I240K) | CAAAAAGCTGAAGAAATCTAAGAAGATCATCnTAlTGTCG (140) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR17 (V234K) | CAAGACCAAGGCAAAACGCATGATCGCCAT (141) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR18 (I231K) | GTCAAGGAGAAGTCCAAAAGGATCATCATC (142) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR20 (M240K) | CGCCGCGTGCGGGCCAAGCAGCTCCTGCrC (143) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR21 (A251K) | CCTGATAAGCGCTATAAAATGGTCCTGTTTCGA (144) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR22 (F312K) | GAAAGACAAAAGAGAGTCAAGAGGAIUILTITATTG (145) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR24 (T304K) | CGGAGAAAGAGGGTGAAACGCACAGCCATCGCC (146) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR30 (L258K) | alternate approach; see below | alternate approach; see below
GPR31 (Q221K) | AAGCTTCAGCGGGCCAAGGCACTGGTCACC (147) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR32 (K255A) | CATGCCAA(XGGCCCGCGAGGCTGCTGCTGGT (279) | ACCAGCAGCAGCCTTCGCGGGCCGGTTGGCATG (280)
GPR40 (A223K) | CGGAAGCTGCGGGCCAAATGGGTGGCCGGC (265) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR41 (A223K) | CAGAGGAGGGTGAAGGGGCTGTTGGCG (266) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR43 (V221K) | GGCGGCGCCGAGCCAAGGGCCTGGCTGTGG | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
APJ (L247K) | alternate approach; see below | alternate approach; see below
BLR1 (V258K) | CAGCGGCAGAAGGCAAAAAGGGTGGCCATC (148) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
CEPR (L258K) | CGGCAGAAGGCGAAGCGCATGATCCrCGCG (149) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
EBI1 (I262K) | GAGCGCAACAAGGCCAAAAAGGTGATCATC (150) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
EBI2 (L243K) | GGTGTAAACAAAAAGGCTAAAAACACAATTATTCTTATT (151) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
ETBR-LP2 (N358K) | GAGAGCCAGCTCAAGAGCACCGTGGTG (152) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GHSR (V262K) | CCACAAGCAAACCAAGAAAATGCTGGCTGT (153) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPCR-CNS (N491K) | CTAGAGAGTCAGATGAAGTGTACAGTAGTGGCAC (155) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR-NGA (I275K) | CGGACAAAAGTGAAAACTAAAAAGATGTrCCTCATT (156) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
H9a and H9b (F236K) | GCTGAGGTTCGCAATAAACTAACCATGTITGTG (157) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
HB954 (H265K) | GGGAGGCCGAGCTGAAAGCCACCCTGCTC (158) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
HG38 (V765K) | GGGACTGCTCTATGAAAAAACACATTGCCCTG (268) | [...]CATGTGCCAAGTACGCCC (154)
HM74 (I230K) | CAAGATCAAGAGAGCCAAAACCTTCATCATG (159) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
MIG (T273K) | COSGAGACAAGTGAAGAAGATGCTGTITGTC (160) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
OGR1 (Q227K) | GCAAGGACCAGATCAAGCGGCTGGTGCTCA (161) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
Serotonin 5HT2A (C322K) | alternate approach; see below | alternate approach; see below
Serotonin 5HT2C (S310K) | alternate approach; see below | alternate approach; see below
V28 (I230K) | CAAGAAAGCCAAAGCCAAGAAACTGATCCTTCrG (162) | CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
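
The codon mutations in Table B use the standard form: endogenous residue, position, replacement residue (for example, F245K). As a minimal sketch, assuming 1-based residue numbering and using hypothetical helper names (`parse_mutation`, `apply_mutation`) and a made-up toy peptide rather than any sequence from the Sequence Listing, the notation can be read and applied as follows:

```python
import re
from typing import Tuple


def parse_mutation(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'F245K' into (endogenous residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a codon mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(protein: str, label: str) -> str:
    """Return the protein with the substitution applied (positions are 1-based)."""
    endogenous, position, replacement = parse_mutation(label)
    index = position - 1
    if index >= len(protein) or protein[index] != endogenous:
        raise ValueError(f"residue {position} is not {endogenous}")
    return protein[:index] + replacement + protein[index + 1:]


toy_peptide = "MKTAVLP"  # hypothetical sequence, for illustration only

print(parse_mutation("F245K"))
print(apply_mutation(toy_peptide, "V5K"))
```

Checking the endogenous residue before substituting guards against numbering mistakes, which matters here because several receptors in the table share the same position numbers.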





The non-endogenous human GPCRs were then sequenced and the derived and verified nucleic
acid and amino acid sequences are listed in the accompanying "Sequence Listing" appendix
to this patent document, as summarized in Table C below:



Table C

Mutated GPCR | Nucleic Acid Sequence Listing | Amino Acid Sequence Listing
GPR1 (F245K) | SEQ.ID.NO.: 163 | SEQ.ID.NO.: 164
GPR4 (K223A) | SEQ.ID.NO.: 165 | SEQ.ID.NO.: 166
GPR5 (V224K) | SEQ.ID.NO.: 167 | SEQ.ID.NO.: 168
GPR7 (T250K) | SEQ.ID.NO.: 169 | SEQ.ID.NO.: 170
GPR8 (T259K) | SEQ.ID.NO.: 171 | SEQ.ID.NO.: 172
GPR9 (M254K) | SEQ.ID.NO.: 173 | SEQ.ID.NO.: 174
GPR9-6 (L241K) | SEQ.ID.NO.: 175 | SEQ.ID.NO.: 176
GPR10 (F276K) | SEQ.ID.NO.: 177 | SEQ.ID.NO.: 178
GPR15 (I240K) | SEQ.ID.NO.: 179 | SEQ.ID.NO.: 180
GPR17 (V234K) | SEQ.ID.NO.: 181 | SEQ.ID.NO.: 182
GPR18 (I231K) | SEQ.ID.NO.: 183 | SEQ.ID.NO.: 184
GPR20 (M240K) | SEQ.ID.NO.: 185 | SEQ.ID.NO.: 186
GPR21 (A251K) | SEQ.ID.NO.: 187 | SEQ.ID.NO.: 188
GPR22 (F312K) | SEQ.ID.NO.: 189 | SEQ.ID.NO.: 190
GPR24 (T304K) | SEQ.ID.NO.: 191 | SEQ.ID.NO.: 192
GPR30 (L258K) | SEQ.ID.NO.: 193 | SEQ.ID.NO.: 194
GPR31 (Q221K) | SEQ.ID.NO.: 195 | SEQ.ID.NO.: 196
GPR32 (K255A) | SEQ.ID.NO.: 269 | SEQ.ID.NO.: 270
GPR40 (A223K) | SEQ.ID.NO.: 271 | SEQ.ID.NO.: 272
GPR41 (A223K) | SEQ.ID.NO.: 273 | SEQ.ID.NO.: 274
GPR43 (V221K) | SEQ.ID.NO.: 275 | SEQ.ID.NO.: 276
APJ (L247K) | SEQ.ID.NO.: 197 | SEQ.ID.NO.: 198
BLR1 (V258K) | SEQ.ID.NO.: 199 | SEQ.ID.NO.: 200
CEPR (L258K) | SEQ.ID.NO.: 201 | SEQ.ID.NO.: 202
EBI1 (I262K) | SEQ.ID.NO.: 203 | SEQ.ID.NO.: 204
EBI2 (L243K) | SEQ.ID.NO.: 205 | SEQ.ID.NO.: 206
ETBR-LP2 (N358K) | SEQ.ID.NO.: 207 | SEQ.ID.NO.: 208
GHSR (V262K) | SEQ.ID.NO.: 209 | SEQ.ID.NO.: 210
GPCR-CNS (N491K) | SEQ.ID.NO.: 211 | SEQ.ID.NO.: 212
GPR-NGA (I275K) | SEQ.ID.NO.: 213 | SEQ.ID.NO.: 214
H9a (F236K) | SEQ.ID.NO.: 215 | SEQ.ID.NO.: 216
H9b (F236K) | SEQ.ID.NO.: 217 | SEQ.ID.NO.: 218
HB954 (H265K) | SEQ.ID.NO.: 219 | SEQ.ID.NO.: 220
HG38 (V765K) | SEQ.ID.NO.: 277 | SEQ.ID.NO.: 278
HM74 (I230K) | SEQ.ID.NO.: 221 | SEQ.ID.NO.: 222
MIG (T273K) | SEQ.ID.NO.: 223 | [...]
OGR1 (Q227K) | SEQ.ID.NO.: 225 | SEQ.ID.NO.: 226
Serotonin 5HT2A (C322K) | SEQ.ID.NO.: 227 | SEQ.ID.NO.: 228
Serotonin 5HT2C (S310K) | SEQ.ID.NO.: 229 | SEQ.ID.NO.: 230
V28 (I230K) | SEQ.ID.NO.: 231 | SEQ.ID.NO.: 232
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2. Alternate Mutation Approaches for Employment of the Proline Marker 
Algorithm: APJ; Serotonin SHTja; Serotonin SHTjc; and GPR30 

Although the above site-directed mutagenesis approach is particularly preferred, other 
approaches can be utilized to create such mutations; those skilled in the art are readily credited 
5 with selecting approaches to mutating a GPCR that fits within the particular needs of the artisan. 
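
The Proline Marker Algorithm named in the heading above places the mutated residue X on the N-terminal side of a TM6 proline, separated from it by fifteen residues (the motif is the proline, then fifteen residues, then X, read from the C-terminus toward the N-terminus). The position arithmetic can be sketched in Python as follows. The function names and the toy sequence are hypothetical illustrations, not part of the disclosure, and locating the TM6 proline itself is assumed to have been done separately.

```python
from typing import Tuple


def marker_position(proline_pos: int, spacer: int = 15) -> int:
    """Return the 1-based position of X given the 1-based TM6 proline position.

    Read from the C-terminus toward the N-terminus the motif is the proline,
    then `spacer` residues, then X, so X sits spacer + 1 residues N-terminal
    to the proline.
    """
    pos = proline_pos - (spacer + 1)
    if pos < 1:
        raise ValueError("proline is too close to the N-terminus")
    return pos


def apply_marker_mutation(seq: str, proline_pos: int,
                          new_residue: str = "K") -> Tuple[str, str]:
    """Mutate the residue at the marker position; return (label, mutated sequence)."""
    if seq[proline_pos - 1] != "P":
        raise ValueError("no proline at the stated position")
    pos = marker_position(proline_pos)
    old = seq[pos - 1]
    mutated = seq[: pos - 1] + new_residue + seq[pos:]
    return f"{old}{pos}{new_residue}", mutated


# Toy 30-residue sequence (hypothetical) with a proline at position 25.
toy = "MAGTSLVRQAEFHWYCDNIGTSAVPLLMFA"
label, mutant = apply_marker_mutation(toy, proline_pos=25)
print(label, mutant)
```

Under this arithmetic a mutation at position 258, as in GPR30 (L258K), would pair with a proline at position 274; that pairing is inferred from the motif definition and is not stated in the text.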

a. APJ

Preparation of the non-endogenous, human APJ receptor was accomplished by
mutating L247K. Two oligonucleotides containing this mutation were synthesized:
5'-GGCTTAAGAGCATCATCGTGGTGCTGGTG-3' (SEQ.ID.NO.: 233)
5'-GTCACCACCAGCACCACGATGATGCTCTTAAGCC-3' (SEQ.ID.NO.: 234)

The two oligonucleotides were annealed and used to replace the NaeI-BstEII segment of human,
endogenous APJ to generate the non-endogenous version of human APJ.
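
Whether the two APJ oligonucleotides can anneal as described is easy to check computationally. The short Python sketch below is an illustration, not part of the original protocol: it reverse-complements the second oligonucleotide and compares it with the first. The variable names are hypothetical, and the comment about cut-site geometry reflects general knowledge of NaeI (blunt cutter) and BstEII (five-nucleotide 5' overhang) rather than anything stated in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


sense = "GGCTTAAGAGCATCATCGTGGTGCTGGTG"            # SEQ.ID.NO.: 233
antisense = "GTCACCACCAGCACCACGATGATGCTCTTAAGCC"   # SEQ.ID.NO.: 234

antisense_rc = reverse_complement(antisense)

# The sense oligonucleotide pairs along its full length if it is a prefix
# of the reverse complement of the antisense oligonucleotide.
pairs_fully = antisense_rc.startswith(sense)

# Unpaired bases left at the 5' end of the antisense oligonucleotide.
overhang = antisense[: len(antisense) - len(sense)]

print(pairs_fully, overhang)
```

The annealed duplex is flush at one end and carries a five-nucleotide 5' extension at the other, which is the geometry one would expect for a fragment ligated between a blunt site and a BstEII-type sticky end.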

b. Serotonin 5HT2A

cDNA containing the point mutation C322K was constructed by utilizing the restriction
enzyme site Sph I which encompasses amino acid 322. A primer containing the C322K mutation:
5'-CAAAGAAAGTACTGGGCATCGTCTTCTTCCT-3' (SEQ.ID.NO.: 235)
was used along with the primer from the 3' untranslated region of the receptor:
5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO.: 236)
to perform PCR (under the conditions described above). The resulting PCR fragment was then
used to replace the 3' end of endogenous 5HT2A cDNA through the T4 polymerase blunted Sph
I site.

c. Serotonin 5HT2C

The cDNA containing a S310K mutation was constructed by replacing the Sty I restriction
fragment containing amino acid 310 with synthetic double stranded oligonucleotides that encode
the desired mutation. The sense strand sequence utilized had the following sequence:

5'-CTAGGGGCACCATGCAGGCTATCAACAATGAAAGAAAAGCTAAGAAAGTC-3'
(SEQ.ID.NO.: 237)

and the antisense strand sequence utilized had the following sequence:

5'-CAAGGACTTTCTTAGCTTTTCTTTCATTGTTGATAGCCTGCATGGT... (SEQ.
ID. NO.: 238)

d. GPR30

Prior to generating non-endogenous GPR30, several independent pCR2.1/GPR30 isolates
were sequenced in their entirety in order to identify clones with no PCR-generated mutations.
A clone having no mutations was digested with EcoRI and the endogenous GPR30 cDNA fragment
was transferred into the CMV-driven expression plasmid pCI-neo (Promega), by digesting pCI-neo
with EcoRI and subcloning the EcoRI-liberated GPR30 fragment from pCR2.1/GPR30, to
generate pCI/GPR30. Thereafter, the leucine at codon 258 was mutated to a lysine using a
QuikChange™ Site-Directed Mutagenesis Kit (Stratagene, #200518), according to manufacturer's
instructions, and the following primers:

5'-CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT-3' (SEQ.ID.NO.: 239) and

5'-ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG-3' (SEQ.ID.NO.: 240)
Example 3 

Receptor (Endogenous and Mutated) Expression

Although a variety of cells are available to the art for the expression of proteins, it is most
preferred that mammalian cells be utilized. The primary reason for this is predicated upon
practicalities, i.e., utilization of yeast cells for the expression of a GPCR, while possible,
introduces into the protocol a non-mammalian cell which may not (indeed, in the case of
yeast, does not) include the receptor-coupling, genetic-mechanism and secretory pathways that
have evolved for mammalian systems - thus, results obtained in non-mammalian cells, while
of potential use, are not as preferred as those obtained from mammalian cells. Of the
mammalian cells, COS-7, 293 and 293T cells are particularly preferred, although the specific
mammalian cell utilized can be predicated upon the particular needs of the artisan.

Unless otherwise noted herein, the following protocol was utilized for the expression
of the endogenous and non-endogenous human GPCRs. Table D lists the mammalian cell and
number utilized (per 150mm plate) for GPCR expression.



Table D

Receptor Name                       Mammalian Cell
(Endogenous or Non-Endogenous)      (Number Utilized)

GPR17                               293 (2 x 10^)
GPR30                               293 (4 x 10^)
APJ                                 COS-7 (5 x 10*)
ETBR-LP2                            293 (1 x 10')
                                    293T (1 x 10')
GHSR                                293 (1 x 10')
                                    293T (1 x 10')
MIG                                 293 (1 x 10')
Serotonin 5HT2A                     293T (1 x 10')
Serotonin 5HT2C                     293T (1 x 10')



On day one, mammalian cells were plated out. On day two, two reaction tubes were
prepared (the proportions to follow for each tube are per plate): tube A was prepared by mixing
20µg DNA (e.g., pCMV vector; pCMV vector with endogenous receptor cDNA, and pCMV
vector with non-endogenous receptor cDNA) in 1.2ml serum free DMEM (Irvine Scientific,
Irvine, CA); tube B was prepared by mixing 120µl lipofectamine (Gibco BRL) in 1.2ml serum
free DMEM. Tubes A and B were then admixed by inversions (several times), followed by
incubation at room temperature for 30-45min. The admixture is referred to as the "transfection
mixture". Plated cells were washed with 1XPBS, followed by addition of 10ml serum free
DMEM. 2.4ml of the transfection mixture was then added to the cells, followed by incubation
for 4hrs at 37°C/5% CO2. The transfection mixture was then removed by aspiration, followed by
the addition of 25ml of DMEM/10% Fetal Bovine Serum. Cells were then incubated at 37°C/5%
CO2. After 72hr incubation, cells were then harvested and utilized for analysis.
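
Because the proportions above are stated per plate, preparing several plates at once is a matter of scaling each component. The following Python sketch is one possible way to do that; it is an illustration only. The per-plate amounts are those given in the protocol (20µg DNA and 120µl lipofectamine, each in 1.2ml serum free DMEM), while the function name and the optional `excess` pipetting allowance are hypothetical additions.

```python
# Per-plate amounts taken from the protocol text.
PER_PLATE = {
    "dna_ug": 20.0,            # tube A: DNA
    "tube_a_dmem_ml": 1.2,     # tube A: serum free DMEM
    "lipofectamine_ul": 120.0, # tube B: lipofectamine
    "tube_b_dmem_ml": 1.2,     # tube B: serum free DMEM
}


def scale_transfection(n_plates: int, excess: float = 0.0) -> dict:
    """Scale the per-plate recipe to n_plates.

    `excess` is a hypothetical fractional allowance for pipetting loss
    (0.1 means 10% extra); it is not part of the original protocol.
    """
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    factor = n_plates * (1.0 + excess)
    scaled = {name: round(amount * factor, 3) for name, amount in PER_PLATE.items()}
    # Tubes A and B are combined to give the transfection mixture.
    scaled["mixture_ml"] = round(
        scaled["tube_a_dmem_ml"] + scaled["tube_b_dmem_ml"], 3
    )
    return scaled


print(scale_transfection(3))
```

For a single plate the combined volume comes to 2.4ml, which matches the volume of transfection mixture the protocol adds to each plate.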
1. Gi-Coupled Receptors: Co-Transfection with Gs-Coupled Receptors

In the case of GPR30, it has been determined that this receptor couples to the G protein Gi.
Gi is known to inhibit the enzyme adenylyl cyclase, which is necessary for catalyzing the
conversion of ATP to cAMP. Thus, a non-endogenous, constitutively activated form of GPR30
would be expected to be associated with decreased levels of cAMP. Assay confirmation of a non-
endogenous, constitutively activated form of GPR30 directly via measurement of decreasing levels
of cAMP, while viable, can be preferably measured by cooperative use of a Gs-coupled receptor.
For example, a receptor that is Gs-coupled will stimulate adenylyl cyclase, and thus will be
associated with an increase in cAMP. The assignee of the present application has discovered that
the orphan receptor GPR6 is an endogenous, constitutively activated GPCR. GPR6 couples to the
Gs protein. Thus, when co-transfected, one can readily verify that a putative GPR30-mutation
leads to constitutive activation thereof: an endogenous, constitutively activated
GPR6/endogenous, non-constitutively activated GPR30 cell will evidence an elevated level of
cAMP when compared with an endogenous, constitutively active GPR6/non-endogenous,
constitutively activated GPR30 (the latter evidencing a comparatively lower level of cAMP).

Assays that detect cAMP can be utilized to determine if a candidate compound is, e.g., an inverse
agonist to a Gs-associated receptor (i.e., such a compound would decrease the levels of cAMP) or
a Gi-associated receptor (or a Go-associated receptor) (i.e., such a candidate compound would
increase the levels of cAMP). A variety of approaches known in the art for measuring cAMP can
be utilized; a preferred approach relies upon the use of anti-cAMP antibodies. Another approach,
and most preferred, utilizes a whole cell second messenger reporter system assay. Promoters on
genes drive the expression of the proteins that a particular gene encodes. Cyclic AMP drives gene
expression by promoting the binding of a cAMP-responsive DNA binding protein or transcription
factor (CREB) which then binds to the promoter at specific sites called cAMP response elements
and drives the expression of the gene. Reporter systems can be constructed which have a promoter
containing multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or
luciferase. Thus, an activated receptor such as GPR6 causes the accumulation of cAMP which then
activates the gene and expression of the reporter protein. Most preferably, 293 cells are co-
transfected with GPR6 (or another Gs-linked receptor) and GPR30 (or another Gi-linked receptor)
plasmids, preferably in a 1:1 ratio, most preferably in a 1:4 ratio. Because GPR6 is an
endogenous, constitutively active receptor that stimulates the production of cAMP, GPR6 strongly
activates the reporter gene and its expression. The reporter protein such as β-galactosidase or
luciferase can then be detected using standard biochemical assays (Chen et al. 1995). Co-
transfection of endogenous, constitutively active GPR6 with endogenous, non-constitutively active
GPR30 evidences an increase in the luciferase reporter protein. Conversely, co-transfection of
endogenous, constitutively active GPR6 with non-endogenous, constitutively active GPR30
evidences a drastic decrease in expression of luciferase. Several reporter plasmids are known and
available in the art for measuring a second messenger assay. It is considered well within the
skilled artisan to determine an appropriate reporter plasmid for a particular gene expression based
primarily upon the particular need of the artisan. Although a variety of cells are available for
expression, mammalian cells are most preferred, and of these types, 293 cells are most preferred.
293 cells were transfected with the reporter plasmid pCRE-Luc/GPR6 and non-endogenous,
constitutively activated GPR30 using a Mammalian Transfection™ Kit (Stratagene, #200285)
CaPO4 precipitation protocol according to the manufacturer's instructions (see, 28 Genomics 347
(1995) for the published endogenous GPR6 sequence). The precipitate contained 400ng reporter,
80ng CMV-expression plasmid (having a 1:4 GPR6 to endogenous GPR30 or non-endogenous
GPR30 ratio) and 20ng CMV-SEAP (a transfection control plasmid encoding secreted alkaline
phosphatase). 50% of the precipitate was split into 3 wells of a 96-well tissue culture dish
(containing 4 X 10^ cells/well); the remaining 50% was discarded. The following morning, the
media was changed. 48 hr after the start of the transfection, cells were lysed and examined for
luciferase activity using a Luclite™ Kit (Packard, Cat. # 6016911) and Trilux 1450 MicroBeta™
liquid scintillation and luminescence counter (Wallac) as per the vendor's instructions. The data
were analyzed using GraphPad Prism 2.0a (GraphPad Software Inc.).

With respect to GPR17, which has also been determined to be Gi-linked, a modification
of the foregoing approach was utilized, based upon, inter alia, use of another Gs-linked
endogenous receptor, GPR3 (see 23 Genomics 609 (1994) and 24 Genomics 391 (1994)). Most
preferably, 293 cells are utilized. These cells were plated-out on 96 well plates at a density of 2
X 10^ cells per well and were transfected using Lipofectamine Reagent (BRL) the following day
according to manufacturer instructions. A DNA/lipid mixture was prepared for each 6-well
transfection as follows: 260ng of plasmid DNA in 100µl of DMEM were gently mixed with ...
of lipid in 100µl of DMEM (the 260ng of plasmid DNA consisted of 200ng of a 8xCRE-Luc
reporter plasmid (see below), 50ng of pCMV comprising endogenous receptor or non-endogenous
receptor or pCMV alone, and 10ng of a GPR3 expression plasmid (GPR3 in pcDNA3
(Invitrogen))). The 8xCRE-Luc reporter plasmid was prepared as follows: vector SRIF-β-gal was
obtained by cloning the rat somatostatin promoter (-71/+51) at BglII-HindIII site in the pβgal-
Basic Vector (Clontech). Eight (8) copies of cAMP response element were obtained by PCR from
an adenovirus template AdpCF126CCRE8 (see 7 Human Gene Therapy 1883 (1996)) and cloned
into the SRIF-β-gal vector at the Kpn-BglII site, resulting in the 8xCRE-β-gal reporter vector. The
8xCRE-Luc reporter plasmid was generated by replacing the beta-galactosidase gene in the
8xCRE-β-gal reporter vector with the luciferase gene obtained from the pGL3-basic vector
(Promega) at the HindIII-BamHI site. Following 30min. incubation at room temperature, the
DNA/lipid mixture was diluted with 400 µl of DMEM and 100µl of the diluted mixture was added
to each well. 100 µl of DMEM with 10% FCS were added to each well after a 4hr incubation in
a cell culture incubator. The next morning the transfected cells were changed with 200 µl/well of
DMEM with 10% FCS. Eight (8) hours later, the wells were changed to 100 µl/well of DMEM
without phenol red, after one wash with PBS. Luciferase activity was measured the next day
using the LucLite™ reporter gene assay kit (Packard) following manufacturer instructions and read
on a 1450 MicroBeta™ scintillation and luminescence counter (Wallac).
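
The quantities in the GPR17 protocol are internally consistent, which can be confirmed with a few lines of arithmetic. The Python sketch below is illustrative only; the dictionary keys and variable names are hypothetical labels for the three plasmids and the three volumes named in the text.

```python
# Plasmid amounts (ng) making up the DNA portion of one DNA/lipid mixture.
components_ng = {
    "8xCRE-Luc reporter": 200,
    "pCMV (receptor or empty vector)": 50,
    "GPR3 expression plasmid": 10,
}
total_dna_ng = sum(components_ng.values())

# Volumes (µl): DNA in DMEM, lipid in DMEM, then the DMEM used for dilution.
mixture_volume_ul = 100 + 100 + 400

# 100 µl of the diluted mixture is added to each well.
volume_per_well_ul = 100
wells_covered = mixture_volume_ul // volume_per_well_ul

# DNA delivered to each well, per plasmid and in total.
per_well_ng = {name: ng / wells_covered for name, ng in components_ng.items()}
per_well_total_ng = total_dna_ng / wells_covered

print(total_dna_ng, wells_covered, round(per_well_total_ng, 1))
```

The three plasmid amounts sum to the stated 260ng, and the diluted volume covers exactly six wells at 100µl each, in line with the "6-well transfection" wording.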

Figure 4 evidences that constitutively active GPR30 inhibits GPR6-mediated
activation of CRE-Luc reporter in 293 cells. Luciferase was measured at about 4.1 relative
light units in the expression vector pCMV. Endogenous GPR30 expressed luciferase at about
8.5 relative light units, whereas the non-endogenous, constitutively active GPR30 (L258K)
expressed luciferase at about 3.8 and 3.1 relative light units, respectively. Co-transfection of
endogenous GPR6 with endogenous GPR30, at a 1:4 ratio, drastically increased luciferase
expression to about 104.1 relative light units. Co-transfection of endogenous GPR6 with non-
endogenous GPR30 (L258K), at the same ratio, drastically decreased the expression, which
is evident at about 18.2 and 29.5 relative light units, respectively. Similar results were
observed with respect to GPR17 with respect to co-transfection with GPR3, as set forth in
Figure 5.
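
The size of the decrease reported for Figure 4 can be expressed as a percentage of the GPR6/endogenous GPR30 signal. The Python sketch below shows one way to compute it from the relative light unit values quoted above. The formula is an assumption made for illustration: the text does not state how a per-cent difference was calculated, and the function and variable names are hypothetical.

```python
def percent_decrease(reference: float, observed: float) -> float:
    """Decrease of `observed` relative to `reference`, as a percentage.

    Assumed formula: (reference - observed) / reference * 100.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (reference - observed) / reference * 100.0


# Relative light units quoted for Figure 4.
gpr6_with_endogenous_gpr30 = 104.1
gpr6_with_gpr30_l258k = [18.2, 29.5]

decreases = [
    percent_decrease(gpr6_with_endogenous_gpr30, rlu)
    for rlu in gpr6_with_gpr30_l258k
]
for rlu, pct in zip(gpr6_with_gpr30_l258k, decreases):
    print(f"{rlu} RLU: {pct:.1f}% below the GPR6/endogenous GPR30 signal")
```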

Example 3 

Assays for Determination of Constitutive Activity
of Non-Endogenous GPCRs

A. Membrane Binding Assays
1. [35S]GTPγS Assay

When a G protein-coupled receptor is in its active state, either as a result of ligand binding
or constitutive activation, the receptor couples to a G protein and stimulates the release of GDP
and subsequent binding of GTP to the G protein. The alpha subunit of the G protein-receptor
complex acts as a GTPase and slowly hydrolyzes the GTP to GDP, at which point the receptor
normally is deactivated. Constitutively activated receptors continue to exchange GDP for GTP.
The non-hydrolyzable GTP analog, [35S]GTPγS, can be utilized to demonstrate enhanced binding
of [35S]GTPγS to membranes expressing constitutively activated receptors. The advantage of
using [35S]GTPγS binding to measure constitutive activation is that: (a) it is generically applicable
to all G protein-coupled receptors; (b) it is proximal at the membrane surface, making it less likely
to pick-up molecules which affect the intracellular cascade.

The assay utilizes the ability of G protein coupled receptors to stimulate [35S]GTPγS
binding to membranes expressing the relevant receptors. The assay can, therefore, be used in
the direct identification method to screen candidate compounds to known, orphan and
constitutively activated G protein-coupled receptors. The assay is generic and has application
to drug discovery at all G protein-coupled receptors.

The [35S]GTPγS assay was incubated in 20 mM HEPES and between 1 and about 20mM MgCl2
(this amount can be adjusted for optimization of results, although 20mM is preferred) pH 7.4,
binding buffer with between about 0.3 and about 1.2 nM [35S]GTPγS (this amount can be adjusted
for optimization of results, although 1.2 is preferred) and 12.5 to 75 µg membrane protein (e.g.
COS-7 cells expressing the receptor; this amount can be adjusted for optimization, although 75µg
is preferred) and 1 µM GDP (this amount can be changed for optimization) for 1 hour.
Wheatgerm agglutinin beads (25 µl; Amersham) were then added and the mixture was incubated
for another 30 minutes at room temperature. The tubes were then centrifuged at 1500 x g for 5
minutes at room temperature and then counted in a scintillation counter.

A less costly but equally applicable alternative has been identified which also meets the
needs of large scale screening. Flash plates™ and Wallac™ scintistrips may be utilized to format
a high throughput [35S]GTPγS binding assay. Furthermore, using this technique, the assay can be
utilized for known GPCRs to simultaneously monitor tritiated ligand binding to the receptor at the
same time as monitoring the efficacy via [35S]GTPγS binding. This is possible because the Wallac
beta counter can switch energy windows to look at both tritium and 35S-labeled probes. This assay
may also be used to detect other types of membrane activation events resulting in receptor
activation. For example, the assay may be used to monitor phosphorylation of a variety of
receptors (both G protein coupled and tyrosine kinase receptors). When the membranes are
centrifuged to the bottom of the well, the bound [35S]GTPγS or the 32P-phosphorylated receptor
will activate the scintillant which is coated on the wells. Scinti® strips (Wallac) have been used to
demonstrate this principle. In addition, the assay also has utility for measuring ligand binding to
receptors using radioactively labeled ligands. In a similar manner, when the radiolabeled bound
ligand is centrifuged to the bottom of the well, the scintistrip label comes into proximity with the
radiolabeled ligand resulting in activation and detection.

Representative results of a graph comparing Control (pCMV), Endogenous APJ and Non-
Endogenous APJ, based upon the foregoing protocol, are set forth in Figure 6.
2. Adenylyl Cyclase

A Flash Plate™ Adenylyl Cyclase kit (New England Nuclear, Cat. No. SMP004A)
designed for cell-based assays was modified for use with crude plasma membranes. The Flash
Plate wells contain a scintillant coating which also contains a specific antibody recognizing cAMP.
The cAMP generated in the wells was quantitated by a direct competition for binding of
radioactive cAMP tracer to the cAMP antibody. The following serves as a brief protocol for the
measurement of changes in cAMP levels in membranes that express the receptors.

Transfected cells were harvested approximately three days after transfection. Membranes
were prepared by homogenization of suspended cells in buffer containing 20mM HEPES, pH 7.4
and 10mM MgCl2. Homogenization was performed on ice using a Brinkman Polytron™ for
approximately 10 seconds. The resulting homogenate was centrifuged at 49,000 X g for 15
minutes at 4°C. The resulting pellet was then resuspended in buffer containing 20mM HEPES,
pH 7.4 and 0.1 mM EDTA, homogenized for 10 seconds, followed by centrifugation at 49,000 X
g for 15 minutes at 4°C. The resulting pellet can be stored at -80°C until utilized. On the day of
measurement, the membrane pellet was slowly thawed at room temperature, resuspended in buffer
containing 20mM HEPES, pH 7.4 and 10mM MgCl2 (these amounts can be optimized, although
the values listed herein are preferred), to yield a final protein concentration of 0.60mg/ml (the
resuspended membranes were placed on ice until use).

cAMP standards and Detection Buffer (comprising 2 µCi of tracer cAMP (100 µl) to
11 ml Detection Buffer) were prepared and maintained in accordance with the manufacturer's
instructions. Assay Buffer was prepared fresh for screening and contained 20mM HEPES, pH 7.4,
10mM MgCl2, 20mM (Sigma), 0.1 units/ml creatine phosphokinase (Sigma), 50 µM GTP
(Sigma), and 0.2 mM ATP (Sigma); Assay Buffer can be stored on ice until utilized. The assay
was initiated by addition of 50ul of assay buffer followed by addition of 50ul of membrane
suspension to the NEN Flash Plate. The resultant assay mixture is incubated for 60 minutes at
room temperature followed by addition of 100ul of detection buffer. Plates are then incubated an
additional 2-4 hours followed by counting in a Wallac MicroBeta scintillation counter. Values of
cAMP/well are extrapolated from a standard cAMP curve which is contained within each assay
plate. The foregoing assay was utilized with respect to analysis of MIG.
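
Reading cAMP per well off a standard curve can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the kit's or the instrument's method: the standards are hypothetical numbers, linear interpolation between bracketing standards is one simple choice among several, and readings outside the range of the standards are rejected rather than extrapolated. Because the assay is a competition for tracer binding, counts fall as cAMP rises.

```python
from bisect import bisect_left
from typing import List, Tuple


def camp_from_counts(counts: float, standards: List[Tuple[float, float]]) -> float:
    """Estimate cAMP per well from scintillation counts.

    `standards` is a list of (pmol_camp, cpm) pairs from the same plate.
    Linear interpolation between the two bracketing standards is used.
    """
    # Sort by counts (ascending) so the bracketing pair can be found by bisection.
    points = sorted((cpm, pmol) for pmol, cpm in standards)
    cpms = [cpm for cpm, _ in points]
    if not cpms[0] <= counts <= cpms[-1]:
        raise ValueError("counts fall outside the standard curve")
    i = bisect_left(cpms, counts)
    if cpms[i] == counts:
        return points[i][1]
    (c0, p0), (c1, p1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (counts - c0) / (c1 - c0)


# Hypothetical standards: (pmol cAMP per well, counts per minute).
example_standards = [(0.0, 10000.0), (1.0, 8000.0), (10.0, 4000.0), (100.0, 1000.0)]
estimate = camp_from_counts(6000.0, example_standards)
print(estimate)
```

In practice a competition curve is usually closer to sigmoidal on a logarithmic concentration axis, so a fitted curve may be preferable to straight-line interpolation when standards are widely spaced.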
B. Reporter-Based Assays

1. CREB Reporter Assay (Gs-associated receptors)

A method to detect Gs stimulation depends on the known property of the transcription
factor CREB, which is activated in a cAMP-dependent manner. A PathDetect CREB trans-
Reporting System (Stratagene, Catalogue # 219010) was utilized to assay for Gs coupled
activity in 293 or 293T cells. Cells were transfected with the plasmid components of the
above system and the indicated expression plasmid encoding endogenous or mutant receptor
using a Mammalian Transfection Kit (Stratagene, Catalogue #200285) according to the
manufacturer's instructions. Briefly, 400 ng pFR-Luc (luciferase reporter plasmid containing
Gal4 recognition sequences), 40 ng pFA2-CREB (Gal4-CREB fusion protein containing the
Gal4 DNA-binding domain), 80 ng CMV-receptor expression plasmid (comprising the
receptor) and 20 ng CMV-SEAP (secreted alkaline phosphatase expression plasmid; alkaline
phosphatase activity is measured in the media of transfected cells to control for variations in
transfection efficiency between samples) were combined in a calcium phosphate precipitate
as per the Kit's instructions. Half of the precipitate was equally distributed over 3 wells in a
96-well plate, kept on the cells overnight, and replaced with fresh medium the following
morning. Forty-eight (48) hr after the start of the transfection, cells were treated and assayed
for luciferase activity as set forth with respect to the GPR30 system, above. This assay was
used with respect to GHSR.
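
Two small calculations follow from the CREB reporter protocol: how much of each plasmid reaches a well, and how a luciferase reading might be corrected with the CMV-SEAP control. The Python sketch below illustrates both. The plasmid amounts, the one-half fraction and the three wells are from the text; dividing luciferase by SEAP activity is an assumed, commonly used form of normalization that the text does not spell out, and the readings are hypothetical.

```python
# Plasmid amounts (ng) combined in one calcium phosphate precipitate.
precipitate_ng = {
    "pFR-Luc": 400,
    "pFA2-CREB": 40,
    "CMV-receptor": 80,
    "CMV-SEAP": 20,
}

FRACTION_USED = 0.5  # half of the precipitate is used
WELLS = 3            # distributed equally over 3 wells

per_well_ng = {
    name: ng * FRACTION_USED / WELLS for name, ng in precipitate_ng.items()
}
per_well_total_ng = sum(per_well_ng.values())


def normalized_luciferase(luciferase: float, seap: float) -> float:
    """Luciferase signal divided by SEAP activity from the same well.

    Assumed normalization: SEAP serves as the transfection-efficiency
    control, so the ratio compensates for well-to-well differences.
    """
    if seap <= 0:
        raise ValueError("SEAP activity must be positive")
    return luciferase / seap


# Hypothetical readings from one well.
example = normalized_luciferase(luciferase=1200.0, seap=0.5)
print(per_well_total_ng, example)
```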

2. AP1 reporter assay (Gq-associated receptors)

A method to detect Gq stimulation depends on the known property of Gq-dependent
phospholipase C to cause the activation of genes containing AP1 elements in their promoter.
A Pathdetect AP-1 cis-Reporting System (Stratagene, Catalogue # 219073) was utilized
following the protocol set forth above with respect to the CREB reporter assay, except that the
components of the calcium phosphate precipitate were 410 ng pAP1-Luc, 80 ng receptor
expression plasmid, and 20 ng CMV-SEAP. This assay was used with respect to ETBR-LP2.

C. Intracellular IP3 Accumulation Assay

On day 1, cells comprising the serotonin receptors (endogenous and mutated) were
plated onto 24 well plates, usually 1x10^ cells/well. On day 2 cells were transfected by firstly
mixing 0.25ug DNA in 50 ul serum free DMEM/well and 2 ul lipofectamine in 50 µl
serum free DMEM/well. The solutions were gently mixed and incubated for 15-30 min at
room temperature. Cells were washed with 0.5 ml PBS and 400 µl of serum free media was
mixed with the transfection media and added to the cells. The cells were then incubated for
3-4 hrs at 37°C/5%CO2 and then the transfection media was removed and replaced with
1ml/well of regular growth media. On day 3 the cells were labeled with 3H-myo-inositol.
Briefly, the media was removed and the cells were washed with 0.5 ml PBS. Then 0.5 ml inositol-
free/serum free media (GIBCO BRL) was added/well with 0.25 µCi of 3H-myo-inositol/well
and the cells were incubated for 16-18 hrs o/n at 37°C/5%CO2. On Day 4 the cells were
washed with 0.5 ml PBS and 0.45 ml of assay medium was added containing inositol-
free/serum free media, 10 nM pargyline, 10 mM lithium chloride, or 0.4 ml of assay medium
and 50 ul of 10x ketanserin (ket) to final concentration of 10µM. The cells were then
incubated for 30 min at 37°C. The cells were then washed with 0.5 ml PBS and 200 ul of
fresh/ice cold stop solution (1M KOH; 18 mM Na-borate; 3.8 mM EDTA) was added/well.
The solution was kept on ice for 5-10 min or until cells were lysed and then neutralized by
200 µl of fresh/ice cold neutralization sol. (7.5 % HCl). The lysate was then transferred into
1.5 ml eppendorf tubes and 1 ml of chloroform/methanol (1:2) was added/tube. The solution
was vortexed for 15 sec and the upper phase was applied to a Biorad AG1-X8 anion
exchange resin (100-200 mesh). Firstly, the resin was washed with water at 1:1.25 W/V and
0.9 ml of upper phase was loaded onto the column. The column was washed with 10 ml of
5 mM myo-inositol and 10 ml of 5 mM Na-borate/60mM Na-formate. The inositol tris
phosphates were eluted into scintillation vials containing 10 ml of scintillation cocktail with
2 ml of 0.1 M formic acid/1 M ammonium formate. The columns were regenerated by
washing with 10 ml of 0.1 M formic acid/3M ammonium formate and rinsed twice with dd
H2O and stored at 4°C in water.

Figure 7 provides an illustration of IP3 production from the human 5HT2A receptor
that incorporates the C322K mutation. While these results evidence that the Proline Mutation
Algorithm approach constitutively activates this receptor, for purposes of using such a
receptor for screening for identification of potential therapeutics, a more robust difference
would be preferred. However, because the activated receptor can be utilized for understanding
and elucidating the role of constitutive activation and for the identification of compounds that
can be further examined, we believe that this difference is itself useful in differentiating
between the endogenous and non-endogenous versions of the human 5HT2A receptor.
D. Result Summary

The results for the GPCRs tested are set forth in Table E where the Per-Cent Increase
indicates the percentage difference in results observed for the non-endogenous GPCR as compared
to the endogenous GPCR; these values are followed by parenthetical indications as to the type of
assay utilized. Additionally, the assay system utilized is parenthetically listed (and, in cases where
different Host Cells were used, both are listed). As these results indicate, a variety of assays can
be utilized to determine constitutive activity of the non-endogenous versions of the human GPCRs.
Those skilled in the art, based upon the foregoing and with reference to information available to
the art, are credited with the ability to select and/or maximize a particular assay approach that
suits the particular needs of the investigator.



Table E

Receptor Identifier          Per-Cent Difference
(Codon Mutation)

GPR17 (V234K)                74.5 (CRE-Luc)
GPR30 (L258K)                71.6 (CREB)
APJ (L247K)                  49.0 (GTPγS)
ETBR-LP2 (N358K)             48.4 (AP1-Luc - 293)
                             61.1 (AP1-Luc - 293T)
GHSR (V262K)                 58.9 (CREB - 293)
                             35.6 (CREB - 293T)
MIG (I230K)                  39 (cAMP)
Serotonin 5HT2A (C322K)      33.2 (IP3)
Serotonin 5HT2C (S310K)      39.1 (IP3)


Example 6

Tissue Distribution of Endogenous Orphan GPCRs

Using a commercially available human-tissue dot-blot format, endogenous orphan GPCRs
were probed for a determination of the areas where such receptors are localized. Except as indicated
below, the entire receptor cDNA (radiolabelled) was used as the probe: radiolabeled probe was
generated using the complete receptor cDNA (excised from the vector) using a Prime-It II™
Random Primer Labeling Kit (Stratagene, #300385), according to manufacturer's instructions.
A human RNA Master Blot™ (Clontech, #7770-1) was hybridized with the GPCR
radiolabeled probe and washed under stringent conditions according to manufacturer's
instructions. The blot was exposed to Kodak BioMax Autoradiography film overnight at
-80°C.

Representative dot-blot format results are presented in Figure 8 for GPR1 (8A), GPR30
(8B), and APJ (8C), with results being summarized for all receptors in Table F.

Table F



GPCR 


Tissue Distribution 
(highest levels, relative to other tissues in 
the dot-blot) 


GPR1


Placenta, Ovary, Adrenal



GPR5 



GPR7 



GPR8



GPR9-6 



GPR21 



GPR22 



GPR31 



CEPR 



EBIl 



EBI2 



ETBR-LP2 



GPCR-CNS 



GPR-NGA 



HB954 



HM74 



MIG 
OGR1



V28 



Broad; highest in Heart, Lung, Adrenal, 
Thyroid, Spinal Cord 



Placenta, Thymus, Fetal Thymus 
Lesser levels in spleen, fetal spleen 



Liver, Spleen, Spinal Cord, Placenta



No expression detected 



Thymus, Fetal Thymus
Lesser levels in Small Intestine



Spleen, Lymph Node, Fetal Spleen, ...



Broad 

Broad; very low abundance 



Heart, Fetal Heart 
Lesser levels in Brain 



Stomach 



Broad 



Spleen 



Stomach, Liver, Thyroid, Putamen



Pancreas 

Lesser levels in Lymphoid Tissues 



Lymphoid Tissues, Aorta, Lung, Spinal Cord 



Broad; Brain Tissue 



Brain 

Lesser levels in Testis, Placenta



Pituitary 
Lesser levels in Brain 



Pituitary 



Aorta, Cerebellum 
Lesser levels in most other tissues 



Spleen, Leukocytes, Bone marrow, Mammary
Glands, Lung, Trachea



Low levels in Kidney, Liver, Pancreas, Lung, 
Spleen 



Pituitary, Stomach, Placenta 



Brain, Spleen, Peripheral Leukocytes 



Based upon the foregoing information, it is noted that human GPCRs can also be assessed
for distribution in diseased tissue; comparative assessments as between "normal" and diseased tissue
can then be utilized to determine the potential for over-expression or under-expression of a
particular receptor in a diseased state. In those circumstances where it is desirable to utilize the
non-endogenous versions of the human GPCRs for the purpose of screening to directly identify
candidate compounds of potential therapeutic relevance, it is noted that inverse agonists are useful
in the treatment of diseases and disorders where a particular human GPCR is over-expressed,
whereas agonists or partial agonists are useful in the treatment of diseases and disorders where a
particular human GPCR is under-expressed.

As desired, more detailed, cellular localization of the receptors, using techniques well-
known to those in the art (e.g., in-situ hybridization) can be utilized to identify particular cells
within these tissues where the receptor of interest is expressed.

It is intended that each of the patents, applications, and printed publications mentioned in
this patent document be hereby incorporated by reference in their entirety.

As those skilled in the art will appreciate, numerous changes and modifications may be
made to the preferred embodiments of the invention without departing from the spirit of the
invention. It is intended that all such variations fall within the scope of the invention.

Although a variety of expression vectors are available to those in the art, for purposes of
utilization for both the endogenous and non-endogenous human GPCRs, it is most preferred that
the vector utilized be pCMV. This vector has been deposited with the American Type Culture
Collection (ATCC) on October 13, 1998 (10801 University Blvd., Manassas, VA 20110-2209
USA) under the provisions of the Budapest Treaty for the International Recognition of the Deposit
of Microorganisms for the Purpose of Patent Procedure. The vector was tested by the ATCC on
________, 1998 and determined to be viable on ________, 1998. The ATCC has assigned
the following deposit number to pCMV: ________.
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CLAIMS 

What is claimed is: 
1. A constitutively active, non-endogenous version of an endogenous human orphan G protein-
coupled receptor (GPCR) comprising the following amino acid residues (carboxy-terminus to
amino-terminus orientation) transversing the transmembrane-6 (TM6) and intracellular loop-3
(IC3) regions of the non-endogenous GPCR:

P' AA15 X

wherein:

(1) P' is an amino acid residue located within the TM6 region of the non-
endogenous GPCR, where P' is selected from the group consisting
of (i) the endogenous orphan GPCR proline residue, and (ii) a non-
endogenous amino acid residue other than proline;

(2) AA15 are 15 amino acid residues selected from the group consisting
of (a) the 15 endogenous amino acid residues of the endogenous
orphan GPCR, (b) 15 non-endogenous amino acid residues, and (c)
a combination of 15 amino acid residues, the combination
comprising at least one endogenous amino acid residue of the
endogenous orphan GPCR and at least one non-endogenous amino
acid residue, excepting that none of the 15 endogenous amino acid
residues that are positioned within the TM6 region of the GPCR is
proline; and

(3) X is a non-endogenous amino acid residue located within the IC3 region
of said non-endogenous GPCR.



2. The non-endogenous human GPCR of claim 1 wherein P' is the endogenous proline
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residue. 

3. The non-endogenous human GPCR of claim 1 wherein P' is a non-endogenous amino
acid residue other than a proline residue. 

4. The non-endogenous human GPCR of claim 1 wherein AA15 are the 15 endogenous
amino acid residues of the endogenous GPCR. 

5. The non-endogenous human GPCR of claim 1 wherein X is selected from the group 
consisting of lysine, histidine, arginine and alanine residues, excepting that when the
endogenous amino acid in position X of said endogenous human GPCR is lysine, X 
is selected from the group consisting of histidine, arginine and alanine. 

6. The non-endogenous human GPCR of claim 1 wherein X is a lysine residue, excepting 
that when the endogenous amino acid in position X of said endogenous human GPCR 
is lysine, X is an amino acid other than lysine. 

7. The non-endogenous human GPCR of claim 4 wherein X is a lysine residue, excepting 
that when the endogenous amino acid in position X of said endogenous human GPCR 
is lysine, X is an amino acid other than lysine. 

8. The non-endogenous, human GPCR of claim 1 wherein P' is a proline residue and X
is a lysine residue, excepting that when the endogenous amino acid in position X of
said endogenous human GPCR is lysine, X is an amino acid other than lysine. 

9. A host cell comprising the non-endogenous human GPCR of claim 1.

10. The material of claim 9 wherein said host cell is of mammalian origin.

11. The non-endogenous human GPCR of claim 1 in a purified and isolated form.

12. A nucleic acid sequence encoding a constitutively active, non-endogenous version of
an endogenous human orphan G protein-coupled receptor (GPCR) comprising the following
nucleic acid sequence region transversing the transmembrane-6 (TM6) and intracellular loop-3
(IC3) regions of the orphan GPCR:

3'-Pcodon-(AA-codon)15-Xcodon-5'

wherein:

(1) Pcodon is a nucleic acid encoding region within the TM6 region of the
non-endogenous GPCR, where Pcodon encodes an amino acid selected
from the group consisting of (i) the endogenous GPCR proline residue,
and (ii) a non-endogenous amino acid residue other than proline;

(2) (AA-codon)15 are 15 codons encoding 15 amino acid residues selected
from the group consisting of (a) the 15 endogenous amino acid
residues of the endogenous orphan GPCR, (b) 15 non-endogenous
amino acid residues, and (c) a combination of 15 amino acid residues,
the combination comprising at least one endogenous amino acid
residue of the endogenous orphan GPCR and at least one non-
endogenous amino acid residue, excepting that none of the 15
endogenous amino acid residues that are positioned within the TM6
region of the orphan GPCR is proline; and

(3) Xcodon is a nucleic acid encoding region residue located within the IC3
region of said non-endogenous human GPCR, where Xcodon encodes a
non-endogenous amino acid.

13. The nucleic acid sequence of claim 12 wherein Pcodon encodes an endogenous proline
residue.

14. The nucleic acid sequence of claim 12 wherein Pcodon encodes a non-endogenous
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amino acid residue other than a proline residue. 

15. The nucleic acid sequence of claim 12 wherein Xcodon encodes a non-endogenous
amino acid selected from the group consisting of lysine, histidine, arginine and
alanine, excepting that when the endogenous amino acid in position X of said
endogenous human GPCR is lysine, Xcodon encodes an amino acid selected from the
group consisting of histidine, arginine and alanine.

16. The nucleic acid sequence of claim 13 wherein Xcodon encodes a non-endogenous
lysine amino acid, excepting that when the endogenous amino acid in position X of
said endogenous human GPCR is lysine, Xcodon encodes an amino acid selected from
the group consisting of histidine, arginine and alanine.

17. The nucleic acid sequence of claim 12 wherein Xcodon is selected from the group
consisting of AAA, AAG, GCA, GCG, GCC and GCU.

18. The nucleic acid sequence of claim 12 wherein Xcodon is selected from the group
consisting of AAA and AAG.

19. The nucleic acid sequence of claim 12 wherein Pcodon is selected from the group
consisting of CCA, CCC, CCG and CCU, and Xcodon is selected from the group
consisting of AAA and AAG.

20. A vector comprising the nucleic acid sequence of claim 12.

21. A plasmid comprising the nucleic acid sequence of claim 12.

22. A host cell comprising the nucleic acid sequence of claim 21.

23. The nucleic acid sequence of claim 12 in a purified and isolated form.

24. A method for selecting for alteration an endogenous amino acid residue within the
third intracellular loop of a human G protein-coupled receptor ("GPCR"), said receptor
comprising a transmembrane 6 region and an intracellular loop 3 region, which endogenous
amino acid, when altered to a non-endogenous amino acid, constitutively activates said human
GPCR, comprising the following steps:

(a) identifying an endogenous proline residue within the transmembrane 6 region
of a human GPCR;

(b) identifying, by moving in a direction of the carboxy-terminus region of said
GPCR towards the amino-terminus region of said GPCR, the endogenous 16th
amino acid residue from said proline residue;

(c) altering the endogenous residue of step (b) to a non-endogenous amino acid
residue to create a non-endogenous version of an endogenous human GPCR;
and

(d) determining whether the non-endogenous human GPCR of step (c) is
constitutively active.



25. The method of claim 24 wherein the amino acid residue that is two residues from said
proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-
terminus direction, is tryptophan.

26. A constitutively active, non-endogenous human GPCR produced by the process of
claim 24.

27. A constitutively active, non-endogenous human GPCR produced by the process of
claim 25.

28. An algorithmic approach for creating a non-endogenous, constitutively active version
of an endogenous human G protein coupled receptor (GPCR), said endogenous GPCR
comprising a transmembrane 6 region and an intracellular loop 3 region, the
algorithmic approach comprising the steps of:

(a) selecting an endogenous human GPCR comprising a proline residue in the
transmembrane-6 region;

(b) identifying, by counting 16 amino acid residues from the proline residue of
step (a), in a carboxy-terminus to amino-terminus direction, an endogenous
amino acid residue;

(c) altering the identified amino acid residue of step (b) to a non-endogenous
amino acid residue to create a non-endogenous version of the endogenous
human GPCR; and

(d) determining if the non-endogenous version of the endogenous human GPCR
of step (c) is constitutively active.

29. The algorithmic approach of claim 28 wherein the amino acid residue that is two
residues from said proline residue in the transmembrane 6 region, in a carboxy-
terminus to amino-terminus direction, is tryptophan.

30. A constitutively active, non-endogenous human GPCR produced by the algorithmic 
approach of claim 28. 

31. A constitutively active, non-endogenous human GPCR produced by the algorithmic 
approach of claim 29. 

32. A method for directly identifying a compound selected from the group consisting of
inverse agonists, agonists and partial agonists to a non-endogenous, constitutively
activated human G protein coupled receptor, said receptor comprising a
transmembrane-6 region and an intracellular loop-3 region, comprising the steps of:

(a) selecting an endogenous human GPCR;

(b) identifying a proline residue within the transmembrane-6 region of the GPCR
of step (a);

(c) identifying, in a carboxy-terminus to amino-terminus direction, the
endogenous 16th amino acid residue from the proline residue of step (b);

(d) altering the endogenous amino acid of step (c) to a non-endogenous amino
acid;

(e) confirming that the non-endogenous GPCR of step (d) is constitutively active;

(f) contacting a candidate compound with the non-endogenous, constitutively-
activated GPCR of step (e); and

(g) determining, by measurement of the compound efficacy at said contacted
receptor, whether said compound is an inverse agonist, agonist or partial
agonist of said receptor.

33. The method of claim 32 wherein the non-endogenous amino acid of step (d) is lysine.

34. A compound directly identified by the method of claim 32.

35. The method of claim 32 wherein the directly identified compound is an inverse
agonist.

36. The method of claim 32 wherein the directly identified compound is an agonist.

37. The method of claim 32 wherein the directly identified compound is a partial agonist.

38. A composition comprising the inverse agonist of claim 35.

39. A composition comprising the agonist of claim 36.

40. A composition comprising the partial agonist of claim 37.

41. A method for directly identifying an inverse agonist to a non-endogenous,
constitutively activated human G protein coupled receptor ("GPCR"), said GPCR comprising
a transmembrane-6 region and an intracellular loop-3 region, comprising the steps of:

(a) selecting an endogenous human GPCR;

(b) identifying a proline residue within the transmembrane-6 region of the GPCR of
step (a);

(c) identifying, in a carboxy-terminus to amino-terminus direction, the
endogenous 16th amino acid residue from the proline residue of step (b);

(d) altering the endogenous amino acid of step (c) to a non-endogenous lysine residue;

(e) confirming that the non-endogenous GPCR of step (d) is constitutively active;

(f) contacting a candidate compound with the non-endogenous, constitutively-
activated GPCR of step (e); and

(g) determining, by measurement of the compound efficacy at said contacted receptor,
whether said compound is an inverse agonist of said receptor.

42. An inverse agonist directly identified by the method of claim 37. 

43. A composition comprising an inverse agonist of claim 38. 
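
The residue-selection steps recited in claims 24 and 28 (locate the TM6 proline, count 16 residues toward the amino terminus, alter that residue) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the applicants' implementation: the function names, the 0-based indexing, the toy sequence, and the choice of alanine as the replacement when the endogenous residue is already lysine (claim 5 permits histidine, arginine or alanine) are all hypothetical. Whether the resulting receptor is constitutively active (step (d)) remains an experimental determination that the sketch does not model.

```python
# Illustrative sketch only; names and conventions below are assumptions.
PREFERRED_X = "K"  # lysine, the most preferred non-endogenous residue at X
FALLBACK_X = "A"   # assumption: alanine picked from histidine/arginine/alanine


def select_x_index(sequence: str, proline_index: int) -> int:
    """Return the 0-based index of X: the 16th residue from the TM6 proline,
    counted toward the amino terminus (P, 15 intervening residues, then X)."""
    if sequence[proline_index] != "P":
        raise ValueError("expected proline at proline_index")
    x_index = proline_index - 16
    if x_index < 0:
        raise ValueError("fewer than 16 residues precede the proline")
    return x_index


def has_tryptophan_two_before(sequence: str, proline_index: int) -> bool:
    """Check the feature of claims 25 and 29: tryptophan two residues from
    the proline, in the carboxy-terminus to amino-terminus direction."""
    return proline_index >= 2 and sequence[proline_index - 2] == "W"


def propose_mutant(sequence: str, proline_index: int) -> tuple[str, str]:
    """Return the altered sequence and a label such as 'T4K' (1-based)."""
    x_index = select_x_index(sequence, proline_index)
    endogenous = sequence[x_index]
    # If the endogenous residue is already lysine, use the fallback residue.
    replacement = FALLBACK_X if endogenous == PREFERRED_X else PREFERRED_X
    mutant = sequence[:x_index] + replacement + sequence[x_index + 1:]
    return mutant, f"{endogenous}{x_index + 1}{replacement}"


# Toy sequence (not a real receptor): T is 16 residues N-terminal of P.
toy = "GGGT" + "A" * 13 + "WLP" + "GG"
mutant, label = propose_mutant(toy, toy.index("P"))
print(label, has_tryptophan_two_before(toy, toy.index("P")))
```

In practice the index of the TM6 proline would have to come from an alignment or annotation of the receptor; the sketch takes it as an input rather than guessing it.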
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pCMV Sequence and Restriction Site 



PstI 
Aval 
Ndl 



£coRV 

Wind III £coR I 

• • • 



Ndl 
.Sma I 

BamHtSpel Xba i 



^1 

i m I .Sac II 

>1aelll iBstXI Saet 

11 I 

AAGCTTGATATCGAATTCCTGCAGCCCG6G GGATCCACTAGTTCTAGAGCGGCCBCCACCGCG6TGSAGCTCCAeCTTTT 

' ' ' ' ' ' ' ' ' ' ' ' ' u 



TTCGAACTATAGCTTAAGGACGTCBGGCCCCCTASGTGATCAAGATCTCGCCGGCfiGTGCCGCCACCTCGACGTCGAAAA 

KLDIEFLOPGGSTSSRAA A TAVELQLL 
SL I SNS CSP 80P LVL ERPPPR W S S S F 
P A . r R 1 P A A R G I H . F . S G R H R G G A P A F 

' ' ' ' ' ' ' ' I ■ ■ ' I 1 1 1 . I , H , I 

LSSISNRCGPPOVLELAAAVATSSWSK 
L K I 0 F E Q L G P S G S T R S R G G 6 R H L E L K 0 
A 0 y R 1 G A A R P I W . N . L P R W R P > a G A K 



80 



GTTCCCTTTAGTGAGGGTTAATTGCGCGCTAGAGGATCTTTGTGAAeSAACCTTACTTCTGTGGTGTGACATAA TTGGAC 
CAAGGCAAATCACTCCCAATTAACGCeCGATCTCCTAGAAACACTTCCTTGGAATGAAGACACCACACTGTATTAACCTG 

F P L V R V N C A L E D L C E G T L L L W C 0 I I 6 
«=SL. . GLIAR. RIFVKEPYFCGV T L D 
V P F S EG. L R A RGSL. RNLTSVV. h'n w T 

' ' ' ' ' ' ' I I 1 1 I I H , J 

NGKTLTLQASSSRQSPVICSRHHSMiPr 
E R • H P N 1 A R . L I K T F S G K 0 P T V y N S 
T G K L S P . N R A L P D K H L F R V E T T H C L Q V 

prel 

AAACTACCTACABAGATTTAAAGCTCTAAGGTAAATATAAAATTTTTAAGTGTATAATGTGTTAAACTACT 



f 160 



TTTGATGGATGTCTCTAAATTTCGA6ATTCCATTTATATTTTAAAAATTCACATATTACACAATTTGATCACTAAGATTA 

Q T TYRDLK L. GKYKIFIC. CIMC TT fi^w 
K L P T E I . S S K V N 1 K F L S V C V K L L 1 L I 
*» r L 0 R F K A L R . 1 . N F . V Y N V L N y . F . 



240 



• •« ^, ^ ^ • P L y L r K L H I I H . V V S E 

L S C V S I . L E L T F I F N K L T Y H T L S S l b i 
F . R C L N L A R L y I y F IC . T \ \ "t \ S ' % 'n 'k 



1 I 



TGTTTGTGTATTTTAGAncCAACCTATeeAACTSATeAA TGGGAGCAGTGGTGGAATGCCTTTAATGAGGAA 



AACCTGT 



ACAAACACATAAAATCTAABfiTTeGATACCTTGACTACTTACCCTCGTCACCACCTTACGGAAATTACTCCTTTTGG^ ' 



SACA 



C L CILDSNLWN..MGAVVECL erpv 
. V C V F . I P T y G T D E W E Q W W N A > 'n 'r E N L 
I. r V y F R F 0 P H E L H N G S S G G H % 'l \ \ \ \ 



I I I 



°T \ "t '« ^ ^ " ^ ^ " ' * T T S H R . H P F 6 t' 

^ ° ^ • I G V . P V S S H S C H H F A K L S S F % M 
NT y IC L N W G 1 S S I F P L L P P I G IC I L % V a 



HGURE 3/V 
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TTTGCTCACAAeAAATCCCATCTAGTGATCATGACCCTACTGCTGACTCTCAACATTCTACTCCTCCAAAAAAGAAGAGA 

I I I I I 1 ' I ■ ■ ■ > 1 1 ■ ' i I 1 ■ ■ ' I noo 

AAACGACTCTTCTTTACC67AGATCACTACTACTCCGATGACGACT6AGAGTTGTAAGATGAGGAGGTTTTTTCTTCTCT 

LLRRNAI. . . • GYC. LS7FYSSKK EE 
FCSEEMPSSDOEATADSQHSTPPKKKR 
FAQKKCHLVMMRLLLTLNILLLQKRRE 

I I 1 ' I I 1 I I ' ' I ' 1 i 1 1 1- 

KSLkFAM. HHHP. QQSEVN. EELFSSF 
QESSI60LS5SAVASE. CEVGGFFFL 
KA. F F HWRTI ILSSSVRLnRSRWFLLS 

:Styi 

i 

AA6GTAGAAGACCCCAAGGACTTTCCTTCAGAATTGCTAAGTTTTTTGAGTCATGCTGT6TTTAGTAATAGAACTCTTGC 

■■ ' 1 I 1 I I 1 < J < 1 1 I I I I I IJ80 

TTCCATCTTCTGGGGTTCCTGAAAGGAAGTCTTAACGATTCAAAAAACTCAGTACGACACAAATCATTATCTTGAGAACG 

KGRRFQGLSFRIAKFFESCCV. . . NSC 
KVEDPKDFPSELLSFLSHAVFSNRTLA 
R. KTPRTFL QNC, VF. VMLCLVIELL 

I I 1 - I ■ I ■ ■ I I I ■ ■ 1 I I I I 1 1 : I 

PLLGVPSEKLIALNKSDHQT. YYFEQ 
FTSSGLSKGESNSLKKL/ATNL LLVRA 
LYFVGLVKR. FQ. TKQTflS HKTISSK S 

TTGCTTTGCTATTTACACCACAAAGGAAAAAGCTGCACT6CTATACAAGAAAATTATGGAAAAATATTCTGTAACCTTTA 

— i 1 I III I I i ■ I t ■ 1 I I I I 560 

AAC6AAACGATAAATGTGGTGTTTCCTTTTTCGACGTGAC6ATATGTTCTTTTAATACCTTTTTATAAGACATTG6AAAT 

LLCYLHHK6KSCTAIQENYGKIFCNLY 
CFAI YTTKEKAALLYKKIMEKYSVTF 
LALLFTPQRKKLHCY T R KLWKNIL. PL 

1 I 1 1 ' I I I » 1 1 1 1 I I J 

KSQ. KCWLPFLQVAICSF. PFI NQLR, 
OKAI. VVFSF AASSYLFIISFYETVKI 
AKSNVGCLFFSCQ. VLFNHFF'IRYGK 

.Asel 

I 

TAAGTAGGCATAACAGTTATAATCATAACATACTGTTTTTTCTTACTCCACACAGGCATAGAGTGTCTGCTATTAATAAC 

1 \ 1 1 ' 1 ' 1 ' 1 1 ' I 1 I I I ■ ■ I I 640 

ATTCATCCG7ATTGTCAATATTA6TATTGTATGACAAAAAAGAATGAGGT6TGTCCGTATCTCACAGAC6ATAATTATTG 

K. A . QL. S. HTVFSY STQA. SVCY. . 
ISRHNSYNHN I LFFLTP HRHRVSA I NN 
. VGITVI I ITYCFFLLHTGIECLLL IT 

— H 1 ' I I ' I I 1 I 1 1 I 1 ' ' 1 M ■ ■ i 

LYAYCNYDYCVTK E . EVCAYLTQ. . YS 
LLCLL. L. LH SNK R VGCLCLTDAILL 
YTPUVTI IMVYQKKKSWVPHSHRSNIV 

Rsal 

TATGCTCAAAAATTGTGTACCTTTAGCTTTTTAATTTGTAAAGGGGTTAATAAGGAATATTTGATGTATAGTGCCTTGAC 

' 1 > I ' I I I I I I I 1 I I I I 720 

ATACGAGTTTTTAACACATGGAAATCGAAAAATTAAACATTTCCCCAATTATTCCTTATAAACTACATATCACGGAACTG 

L CSKIVYL. LFNL. RG. . GIFOV. CLO 
YAOKLC TFSFL IC KGVNKEYLMYSALT 
MLKNCVP LAF. FVKGLIRNI. CIVP. 

' I i I ■ I I I I I ! I I ■ I I I I 
HE FITYR. SKLKYLP. YPINS TYHRS 
. A. FNHVKLKK l OLPTLLSYK lYLAKV 
ISLFQTGKAK. NT FPNILFIQ HITGQS 



HGURE 3E> 
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esaB I Pra I 

1 i 

TAGABATCATAATCASCCATAC CACATTTGTAGAGGTTTTACTTGCTTTAAAAAACCTCCCACACCTCCCCCTGAACCTG 
ATCTCTA6TATTAGTCGGTATG6TGTAAACATCTCCAAAATGAACGAAATTTTTTGGAGGGT6TGGAGG6GGACTTG6AC 

. RS. SAIPHL. RFYLL. KTSHTSP . T 
R DHNQPYH I CRGFTCFKKPPTPPPEP' 
LEI I JSHTTFVEVLLALKNLPHLPLNL 

1 J— H ! , 1 , H y^^^ , , , 1 , I 

. LDYDAMGCKYLN. ICS. FVEWVEGQVQ 
LS. L. GYWM0LPKVQKLFG GV6GGSGS 
SIMILWVVNTSTKSAKFFRGCRGRFR 

HInc II 

,Mfel ^Hpal 

AAACATAAAATGAATGCAATTGTTGTTGTTAACTTGTTTATTGCAGCTTATAATG5TTACAAATAAAGCAATAGCATCAC 

' ' ' ' ' ' I I • 1" ■ ■ I I 1 I . ago 

TTTGTATTTTACTTACGTTAACAACAACAATTGAACAAATAACGTCGAATATTACCAATGTTTATTTCGTTATCGTAGTG 

NIK. MOLLLL TC LLQLIMVTNKA IAS 
ET.NECNCCC.LVYCSL.WLQIKQ HH 
K H KHNAIVVVNLFIAAYNGYK. SNS IT 

~ ' 1 — « ' 1 ' 1 1 f 1 H 1— -I , H 

FHFHICNNNNVQKNC SI ITVFLAIADC 
VYFSHLQQO. ST. Q LKYHNCIF C Y C 
FCLIFAITT7 LKNIAA. LP. LYLLLfl'v 

Xbal 

aaatttcacaaataaagcatttttttcactgcattctagttgtggtttgtccaaactcatcaatgtatcttatcatgt'ct 

TTTAAAGTGTTTATTTCGTAAAAAAAGTCACBTAAGATCAACACCAAACAGGTTTGAGTAGTTACATAGAATAeTACAGA ^ 

•fPHK. SIFFTAF. LWFVQTHQCILSCL 
*i r K A F F S L H S S C G L S K L 1 N V S Y H V 

' ' ' • ' • ' 1 1 H- \ 1 1 1 , I 

I E C I F C K K . 0 M R T T T Q G F E D I Y R 1 M D 
«- N . L Y. L M K K V A N . N H N T W V H 1 K D H R 

F K V F L A N K E S C E L 0 P K D L S "« L T 0 T 

agatcttgtggaatgtgtgtcagttagggtgtggaaagtccccagcctccccagcaggcagaagtatgcaaagcatg'cat 



TCTAGAACACCTTACACACAGTCAATCCCACACCTTTCAG6GGTCCGAGGGGTCBTCCGTCTTCATACGTTTCGTACGTA 

R SCGMCVS. GVESPQAPQOAEVCKACI 
D L V E c V S V R V W K VP R L P S R Q K Y A K H A 

• ' W N V C 0 L G C G K S P G S P A G R S H D S H H 

' ' ' I I ■ I I . I ■ ■ 111 , . 

L DOPIHTL. PTSLGWAGWCA STHLAM H 
S R T S H T D T L T H F T G L S G L L C F Y A F C A D 
I «C H . F T H . N P H P F D G P E e A P L L 1 C L « C 



f 1040 



FIGURE 3 C 



wo 00/22129 g^^g PCT/US99/23938 



:Sphl 

JMsM 

CTCAATTAGTCA6CAACCAG6T&TGGAAA6TCCCCA6GCTCCCCAGCAGGCAGAAGTAT6CAAAGCAT6CATCTCAATTA 

' I ' ' ' 1 ' I ' ' I ' I ■ ■ I » K 1120 

6A6TTAATCAG7CGTTGGTCCACACCTTTCAG6GGTCCGAG6GGTCGTCCGTCTTCATACGTTTCGTACGTAGAGTTAAT 

SISQQPGVESPOAPQOAEVCKACISI 
SQLVSNQVWKVPRLPSRQKYAKHASQL 
LN. SATRCGKSPGSPAGRSMQSHHLN. 
1 1 1 1 ' 1 » I » i ■ ' I 1 1 1 1 h 

EIL.-C6PTSLGWA6WCASTHLAHMEIL 
.^NTLLWTHFTGLSGLLCFYAFCAD. N 
RL - DAVLHPFDGPEGAPLLICLMCRL. 

Ncol 

Sty I 

GTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCC6CCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATG 

' 1 ' ' » ■ ' ■ ' 1 I ' ■ I 1 ■ ' I ■ ■ i I -I I 1200 

CAGTCGTT6GTATCAGGGCG6GGATTGAGGCGGGTAGGGCGGGGATTGAGGCGGGTCAAGGCGGGTAAGAGGCGGGGTAC 

SQ QP, SRP, LRPSRP. LRPVPPILRPM 
VSNHSPAPNSAHPAPNSAQFRP F SAPW 
SAT I VPPLTPP I P PLTPPSS.AHSPPH 

1 i 1 1 ' I I I I » I ■ ■ : I t 1 1. 

CGYDRG. SR6DRG. SRGTGGMRRGM 
TLLWLGAGLEAW6AGLEAWNRGNEAGH 
DAVnTGGRVGGflGGRVGGLEAWEGGWP 

Bgl! 

Haelll Haelil i Hae 111 

1 I 11 

GCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGT6AGGAGGCTTT 

' ' ' I ' 1 t ■ I ' I 1 I I I I . I I I I 1280 

CGACT6ATTAAAAAAAATAAATACGTCTCCG6CTCCGGCGGAGCCGGAGACTCGATAAGGTCTTCATCACTCCTCCGAAA 

AD. FFLFMQRPRPPRPLSYSRSSEEAF 
LTNFFYLC RGRGRLGL. AIPEVVRRL 
G. LIFFIYAEAEAASASELFQK. . GGF 

— * H— H 1 ! i— -H 1 1 H i 1 , i , ^ 

AS. NKKNICLGLGGRGRL. ELLLSSAK 
SVLKK. KHLPRPRRPR'^Q AIGSTTLLSK 
OSIKKI. ASASAAEAESSNVFYiHPPK 

Hae Ml 
StuI 

Avrll Aval 
j.Styl ^ol 

TTTGGAGGCCTAGGCT TTTGCAAAAAGCTCCCTCGA6AGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATT 

" ' ' I ' ■ I ■ I ■ I ■ I I i I t r * I ' ' I I I ■ I 1360 

AAACCTCC66ATCCGAAAACGTTTTTC6ACGGAGCTCTC6AACC6CATTAGTACCAGTATCGACAAA6GACACACTTTAA 

LEA. AFAKSSLESLA. SWS. LFPV. N 
FWRPRLLQKAPSRAWRNHG HSCFLCE I 
FSGLGFCKKLPRELGYinVIAVSCVKL 

1 ' I 1 I 'll 1 »— ! 1 1 1 i 

KSA. AKAFLERSLKAYDHDYSN GTHFG 
QL GLSKCFAGELAORL. P. LOKR HSI 
KPPRPKQLFSGRSSPT IHTMATEGTFN 
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^sr8 I 



GTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGfiGTGCCTAATeACTCAcrrT.. 



GATTACTCACTCGATT ^''^^ 



CAA7AGGCGAGTGTTAAGGTGTGTTGTATGCTCGGCCTTCGTATTTCACATTTCGGACCCCACGI 

CYPLTIPHNIRAG SIKCKAWCA 
V I R S 0 F H T T Y E P E A S V K P c v p « * c ^ ' 
L S A H N S T Q H T S R K H k'v'^. ^ 'g \ \ 's 'e \ ' 



" » A . L E , C c , I , % *c \ S % % % \ I, 



Asel 



CTCACATTAATTGCGTTGCGCTCACTGCCC GCTTTCCAGTCGGGAAACCTfiTCCTr;, 



fvull Asel Haelll 



CCAGCTGCATTAATGAATCGGCCA 



GAGTGTAATTAACGCAAC6C6AGTGACGGGCGAAAGGTCA6CCCTTT6GACA6C 



< 1 -f 1520 



:ACG6TCGACGTAATTACTTAGCC6GT 



t- 1 ^- — , ^ 

°_ w s c . H I p w 



svnianresgakwdp 

. H L 0 T A S V A R \ % S "p V % S \ \ ^ °, ^ % 0^ L 

iSapl 

ACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTC CGCTTCCTCGCTCACTGACTCeCTnrr 
eCGCGCCCCTCTCCGCCAAACGCATAACCCSCGAG^ 

fisrei 



1600 



ACCCCGCTCGCCATAGTCGAGTGAGmcCGCCATTATGCCAATAGGTGlcTTAGTCCCliTATTicGTCCTTTCTTGTAC 

'q \ "t 'o 'r 'r ' \ \ \ % ^ ^ s e " N A G k ' N « 

DA E F A T I R N D V S D P S L A P P r M ^ 
• » t P , , E L % % 'l % •, V \ \ % \ % S 'c "t 



1680 



LRRAVSAHSK 
C G E R Y 
A A S G 



FIGURE 3F 



wo 00/22129 



8/19 



PCTAJS99/23938 



^ae ill jHae ill ^ae III 

TGA6CAAAA66CCAGCAAAAGGCCAGGAACC6TAAAAAGGCCGCGTTGCT6GCGTTTTTCCATA6GCTCCGCCCCCCT6A 

' ■ ' I ' ■ ■ ' I ■ I ' I I I ' I ' < 1 I I ■ . . . I i 1760 

ACTCGTTTTCCG6TCGTTTTCCGGTCCTT6GCATTTTTCCGGCGCAACGACCGCAAAAA6GTATCCGA6GCGGG6GGACT 

AK6QQ.KARNRKKAALLAFFHRLRPPD 
EQ KASKRP6TVKRPRCWR FS I6SAPL 
VSK RPAKGQEP. KGRV AGVi^P. APPP. 

I ' ■ ■ I I I I 'l l I I ■ ■ I I I I 

HAFPWCFALFRLFAANSANKWLSRGGS 
S C F A L L.L G P V T FLGRQQRKEMPEA G R V 
LLL6AFPWS GYFPRTAPTKGYA6 GGQ 

CGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTG 

1 ' I . ■ . I ■ . ■ I I i . . I i ■ ■ ■ I . ■ I I I I I I , , I 1540 

6CTC6TAGT6TTTTTAGCTGCGAGTTCAGTCTCCACCGCTTTGGGCT6TCCTGATATTTCTATGGTCCGCAAAGGGGGAC 

EHHKNRRSSQRWRNPTGL. RYQAFP P 
TSITKIDAOVRGGETRQDYKDTRRFPL 
RASQKSTLKS E VAKPDRTIK I PGVSPW 

' - I I I ■ ■ « 1 1 1 1 H I ' I ■ . I 

SC. LFRREL. LHRFGVPSYLYWAK GGP 
LMVFISA. TLP PSVRCS. LSVLRKG R 
RAOCFDVSLDS TAFGSLV IF I 6PTEGQ. 

GAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTG 

t ' I I ■ I I ■ I I I ■ ■ I I I I ■ ■ I I I I , . I 1920 

CTTCGAGGGAGCACGCGA6AGGACAAG6CTGG6ACGGCGAATGGCCTATGGACAGGCGGAAAGA6GGAA6CCCTTCGCAC 

GSSLVR SPVPTLPLTGYLSA F LPSG SV 
EAPSCALLFRPC RLPD TCPPFSLREAW 
KLPRALSCSDPAAYRIPVRLSPFGKR 

— 1 ■ ' ' ' 1 I 1 1 I ' ' I \ 1 ■ ■ ■ ■ I 1 1 I I I 

LERTREGTGVRGSVPYRDAKRGEPLT 
SAGEH AR RNRGQRKGSVQGGKERRSAH 
FSGRASEQESGAA. RIG TRREG KPFRP 

Li 

6CGCTTTCTCAAT6CTCACGCT6TAG6TATCTCAGTTCG6TGTAGGTCGTTCGCTCCAA6CTGG6CT6TGT6CACGAACC 

I I I ' I I ■ I ■ ' I - I I I 2000 

CGCGAAAGAGTTACGAGTGCGACATCCATAGAGTCAAGCCACATCCAGCAAGCGAGGTTC6ACCCGACACACGT6CTTGG 

ALSQCSRCRYLSSV. V'^ RSKLGCVHEP 
RFLNAH AVG I SVRCRSFAPSWA VCTN 
6AFSHLTL. VSQFGVGRSL QA6LCART 

i 1 1 1 ■ ■ I I ■ I ■ ■ ■ ■ I ■ . . ■ I ■ ■ ■ i . I . I ■ I I 

ASE. HERQLYRLETYT7RELSPQTCSC 
RKRLA. ATPIETRHLDNA6L0ATHVFG 
AKEI SVSYTD. NPTPRESVAPSHARV 

CCCCGTTCA6CCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAG7CCAACCCGGTAAGACACGACTTATCGCCAC 

' I ■ I I 1 ■ I ' I ■ I I ■ ' I I I I I ■ I I . ■ I ■ ■ I 2080 

GGGGCAAGTCGGGCTGGCGACGCGGAATAGGCCATTGATAGCAGAACTCAGGTTGGGCCATTCTGTGCTGAATAGCGGTG 

P V 0 P D R C A L S 6 N Y R L E S N P V R H O L S P 
PPFSPTAAPYPVTIVLSPTR. DT TYRH 
PRSA RPL RLIR. LSS. VOPGKT RLI AT 

1 1 1 r I II I 1 ■ I I 1 1 1 1 1- 

GT. GSRQAKDPL. RRSDLGTLCSKDGS 
GN LGVAAG. GTVITKL GVRYSVV. RW 
GREARGSRRIRYSDDQT WGP LVRSIAV 
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,Hae III 

TGGCAGCAGCCACTSGTAACAGGATTAGC AGAGCGAGGTATGTAGGCGGTGCTAC^ 

accgtcgtcggtgaccTttScc^^ 



ATTG 



H- 2160 



"■w \ \ % ^ \ V \ \ \ \ \' ' ^ " 

c qcuu n \ \ ^ ^ ' AVLQSS. SGGLT 

^ ^ ^ " y • ,° ° , • ° s , g V , c R R c r R y L E V V A 

* * * ^ P «- «- I L L A L Y T P P 'a v' S N K 'f u r . ' ' 

TACBGCTACAaAGAAGGACAGTATTTBGTATCTG CGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAG CTC 
ATGCCGATGTGATCncCTGTCATAAACCATAGACGCGAGACGACTTCG(lTCAAlGGAAKCmTTCTiAACCATCGAi 

't 'a 't 'l \ % 'q \ % ^ \ ^ T F G K R V G S S 

"v'a'v "s's'p V % 'n°. \ V ^ •< L I P 'l e' 

ttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaa aaaaggat 

MCTAGGCCGTTTGTTTGGTGGCGACCATCGCCACCAAAMAAC^AACGnCGTCGTCTiATGcicGTclTTTTmCTi 



S_ ^. °- T A G S G G F V_ c IC Q 0 I T R R K K G 

.f ' 

■+- i 1—. f- 



L D P A N K P P L V A V V F L F A S S R L R 4 c ir V 

L IRQTNH RW puc r ,. . « '-"*^'^ ''D 
■ I I I i" ' , ^ " ^ ^ OAADYAOKKRI 



« I « c , F . R 0 r , W ^ ^ 'c 'a S ". 'c % % 



.BspH I 

CTCAAGAAGATCCTTTGATCTTTTCTACGGGGTC TGACGCTCAGTGGAACGAAAACTCACGTTAAGGGAT TTTGG^CATG 
eAGTTCTTCTAGGAAACTAGAAAAGATGCCCCAGACTGC(UGTCACCTTicTTnGAGT(lcAATlcCCTMAACCAGTAi 

'l \ \ °, "l ^ 's % \ \ \ \ \ ° « " S R . G I L V H 
^ ° V« • P T Q R eS%'r%^-v\'-,%\°p° «3 

Dra I pre I 

AGATTATCAAAAAGGAT CTTCACCTAGATCCTm'AAATTAAAAATGAAGTTTTA^ 

TCTAATAGTTTncCTAGAAGTGGATCTAGGAAAATTTAiTTTnACTTiAAAAmAGlTAGAmCAlATATACTCAl ^"^^ 
R L S K R 1 F T . I 

Dyoicgsspb* 

' ' ' ' ' ° L " I oyr^ww , ^ \ \ \ \ " \ 



S S P R 's S ^ '.r ^ ^ « ' ■ S I Y E 

's ' ° % "-P Ve'g L Vk' ^ •-'-^ L D ', . L ! Y 's y' 

HGURE 3^ 



wo 00/22129 



10/19 



PCT/US99/23938 



AACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGCCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTG 

» I -H 1 1 i ■ ' ' I i ' ■ \ ■ « 1 i 1 I ' I 2660 

TTGAACCAGACTGTCAATGGTTACGAATTAGTCACTCCGTGGATAGAGTCGCTAGACAGATAAAGCAAGTAGGTATCAAC 

7WSDSY0CLISEAPISAICLFRSSIV 
KL6LTVTNA. SVRHLSQRSVYFVHP. L 

NLV. OLPMLNQ. GTYLSDLSISFIHSC 
1 H 1 1— H 1 I 1 1 1— H 1 1 , H 

VQOSL. WHKILSAGIEAIQRNREDHTA 
SPRVTVLA.DTLCRO.RDT.KT.GYN 
FKTQCN6ISL. HPV. RLS RDIENI1WLQ 

iHae in 

I 

CCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACC6C6AGAC 

' ' ' ' > I ' . I I I I I I I I I I I I , I 2640 

GGACTGAGGGGCAGCACATCTATTGATGCTATGCCCTCCCGAATGGTAGACCGGGGTCACGACGTTACTAT6GCGCTCTG 

A.LPVV. ITTIREGLPSGPSAAMIPRD 
POSPSCR.LRYGRAYHLAPVLQ. YRET 
LTPRRVDNYDTGGLT IWPGCCNDTAR 
1 ! 1 1 1 1 1 i i 1 , H^^^-H \ i H 

QSG TTYIVVIRSPKGDPGLAAI IGRS 
GSEGDHLYSRYPLA. WRAGTSCHYRSV 
RVGRRTSL. SVP.PSV MQGWHQLSVALC 

Bgll 

.Hae III Ava II 

1 I I 

CCACGCTCACCGGCTCCAGATTT ATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTT 

' ' ' ■ ' I I.I ! t I I ■ I ■ I I I I ■ I i i I 2720 

66TGCGA6TGGCCGAGGTCTAAATAGTCGTTATTTGGTCGGTCGGCCTTCCCGGCTCGCGTCTTCACCAGGACGTT6AAA 

PRSPAPDLSAINQPAGRAERRSGPATL 
HAHRLQIYOQ. TSQPEGPSAEVVLGL 

PTLTGSRFISNKPASR KGRAQICWSCNF 
• 1 ^ \ • 1 1 1 i I I 1 1 1 H 

GREGA6SKDAIFWG APLASRLLP6AVK 
WA. RSWI. . CYVLWGSPGLASTTRCS. 
VSVPELNILLLGALRFPRACFHDQLK 

Asel jgcil fspl 

ATCC6CCTCCATCCAGTCT ATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTG 

' ' ' I ' ' I \ ' I ■ ■ I ■ ■ I I I I I ( ] I I I I I j , , I „ I 2800 

TAGGCGGAGGTAGGTCAGATAATTAACAACGGCCCTTCGATCTCATTCA'tCAAGCGGTCAATTATCAAACGCGTTGCAAC 

SASIQSINCCREARVSSSPVNSLRNV 
rPPPSSLLIVAGKLE. VVRQLI VCATL 
^RLHPVY. LLP6S. SK. FAS. . FAQR C 

' I ' l l 1 ' I 1 1 » H— H 1 i I 

O AEHWOILQQRSALTLLEGTLLKRLTT 

eCGDLRN I TAPFSSYT TRWN 1 T OAVN 
^''RWGT. , NNGPL. LLYNAL. YNACRQ 

'TGCCATTGCTACAGGCATCGTCGT GTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGG 

^ ' ' ' ' ' i ■ ' I i ■ I ■ I I I I ■ I I I I f I I I , j 2860 

•ACG6TAACGATGTCCGTAGCACCACAGTGCGA6CAGCAAACCATACCGAAGTAAGTCGAG6CCAAGG6TT6CTAGTTCC 

'AIATG I VVSRSSFGMASFSSGS QRSR 
l-PLLQASWCHARRLVWLHSAPVPNDQG 
C HCYRHRGVTLVVWYGFIQLRFPTIK 

^ H— ' 1 f » ' I ) ■ I ■ I I 1 ! 1 : I 

AMAVPflTTOREDNP lAENLEPEWROL 
GNS CADHH. ARRKTHS. EAGT6LS. P 
QWQ. LCRPTVSTTQ N C V I L A 



wo 00/22129 



11/19 



PCT/US99/23938 



Avail pvul ^^ae||l 

CCAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTSTCAeAAGTAAGTT 

GCTCAATfiTACTAGGGGGTACAACACGTTTTTTCGCCAATCGAGGAAGCCAGGAGGCTAGCAACAGTCTTCATTiAACCG ^ 

RVT. SPMLCICKAVSSFGPPIvVRcsi^. « 
E L H D P P C C A K K R L A P S V L R S L S E V \ u * 

R T V H 0 G m' N H L F a' T L E K p' G G I 't T I ,' , *' 

\ ". ^ % s 'c % ^ % ". \ \ \ \ «,\'»;«;o \\\\\ 

CGCAGTGTTATCA CTCATGGTTATGGCAGCACTGCATAATTCTCTTACT6TCATCCCATCCGTAAGATGCTTTTCTGTG A 
GCGTCACAATAGTGAGTACCAATACCGTcdTGACGTATTAAGAGAATGAiAGTACGGTAMCAnCTAciAAAAGACAci 

AVLSLMVHAALHNSLTVMPSWRrircu 
P Q C y H S W L W 0 H C 1 I L L L S C H P n i % ^ ^ 

A 7 N 0 S. « T I A A S C L E r' V i M G d' T L H 'iC e' T i 
• EHNHCCOMIRKSDHW G Y S A if R u 
*■ ' ' *• ' PLVAY NE. Q. AMRLISKQS 

Rsa I 
Sea I 

I >leil hineW 

CTGGTGAGTACTCAAC CAAGTCATTCTGAGAATAGTGTATGCGGC6ACCGAGTTGCTCTTGCCCGGCGTC AACACGGGAT 
GACCACTCATGAGTTGGTTCAGTAAGACTCTTATCACATACGCCGCTGGCTCAACGAGAACGGGCCGCAilTGTGCCCTi 

pra I Xmn I 

aataccgcgccacata gcagaacttt'aaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaagg atctt 
ttatggcgcggtgtatcgtcttgaaattttcacgagtagIaaccttttgcaagaagccc<1gcttttgaga6Ttcctagaa 

I I ; " y w F F G A K T L K 0 L 



WW Vc %V/^ "» "« ^W%WWvT\/ 

/ " * " ^ C F K L L A . . 0 F V N K P A F V R L S R 
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ACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTT 

I 1 1 ' I t I ' 1 ' 1 1 ■ t 1 1 ■ ■ t »- 3280 

TGGCGACAACTCTAGGTCAAGC7ACATTGGGTGA6CACGTGGGTTGACTAGAAGTCGTAGAAAAT6AAAGTGGTCGCAAA 

PLLRSSSH. PTRAPN. SSASFTFTSV 
YRC. DPVRCNPLVHPTDLOHLLLSPAF 

TAVEIOFDVTHSCTOLIFS IFYFHQ RF 
— , 1 1 1 1 1 1 1 1 ! 1 i 1 1 J 1. 

GSNLDL EIYGV R AGLQOEADKVKVLTE 
R0QSGTRHL6STCGVSR. CRKSEGAN 
VATSIWNSTVWEHVWSIKLhK. K. WRK 

CTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTC 

I ■ ■ I I I I . I . I I i I 1 1 ' I ■ I ' ■ I \ 3360 

6ACCCACTCGTTTTTGTCCTTCCGTTTTACGGC6TTTTTTCCCTTATTCCCGCTGTGCCTTTACAACTTATGAGTATGAG 

SG.AKTGRQNAAKKGIRATRKC. ILIL 
LGEQKOEGKMPQKRE. GRHGNVEYSYS 
WVSKNRKAKCRKKGNKGOTEM L NTHT 

I I ' 1 i ■ I 1 ' ' t ■ 1 1 I I I ■ ■ I 

PHAFVPLCFAAFFPILAVRFHQ ISMS 
RPSCFCSPL I GCFLSYPRCPFTSYEYE 
OTLLFLFAFHRLFPFLPSVSINFV. VR 



^inc II jSpe I Ase\ 



TTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGCGCGTTGACATTGATTATTGACTAGTTATTAA 

' I ' I I I I ' I ' ■ I I 1 1 I I ' I 3440 

AAGGAAAAAGTTATAA7AACTTCGTAAATAGTCCCAATAACA6AGTACGCGCAACTGTAACTAATAACT6A7CAATAATT 

FLFQYY. SIYQGYCLMRVDIDY. LVIN 
SFFNIIEAFIRVIVSCALTLIID. LL 
LPFSILLKHLSGLLSHAR. H. LLTSY. 

I ' I I ■ ■ I I I ■ I I ■ : ■ I I I I I I 1 I 

KRK. Y. Q Ln. . P. QRMRTSMS. QSTIL 
EKKLIISANILTITEHANVNIIS. NNI 
GKEINNFCKDP NND. ARQCQNNVL. . 

>iae III fig! I 

! I 

TAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCC 

' ' ' 1 1 ' I I I . I I ■ I I 1. 1 I i 1. ■ i i I I 3520 

ATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGOpGCAATGTATTGAATGCCATTTACCGGGCGG 

SNQLRGH. FIAHIWSSALH NLR. MAR 
IVINY6VISS. PIY6VPRYITYGKWPA 
. . SITGSLVHSPYHEFRVT. LTVNGPP 

— ^— ^ 1 I I I I I I I I I ' I ■ I ■ I 

LL. NRP. . NMAWIHLEANCLKRY IARR 
TIL - PTMLEYGMYPTGR. HV. PLHGA 
YT0IVP0N7. LGYISNR7VYSV7FPGG 
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^11 



TGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCC

Aat II, Bgl I, Rsa I

ATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCC

Hae III, Bgl I

CCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCA

BsaA I, Rsa I, SnaB I

[sequence line not legible] 3840
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Aatll 

ACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCA

Rsa I

AAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGT

Ase I

CTGGCTAACTAGAGAACCCACTGCTTAACTGGCTTATCGAAATTAATACGACTCACTATAGGGAGACCC
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Expression plasmid
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT: Behan, Dominic P.
Chalmers, Derek T.
Liaw, Chen W.

(ii) TITLE OF INVENTION: Non-Endogenous, Constitutively Activated Human G Protein-Coupled Orphan Receptors

(iii) NUMBER OF SEQUENCES: 280

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Arena Pharmaceuticals, Inc.
(B) STREET: 6166 Nancy Ridge Drive
(C) CITY: San Diego
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 92122

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Burgoon, Richard P.
(B) REGISTRATION NUMBER: 34,787

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (619)453-7200
(B) TELEFAX: (619)453-7210



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1068 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATGGAAGATT TGGAGGAAAC ATTATTTGAA GAATTTGAAA ACTATTCCTA TGACCTAGAC 60 
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TATTACTCTC TGGAGTCTGA TTTGGAGGAG AAAGTCCAGC TGGGAGTTCT TCACTCGGTC 120 
TCCCTCGTGT TATATTGTTT GGCTTTTGTT CTGGGAATTC CAGGAAATCC CATCGTCATT 180
TGGTTCACGG GGCTCAAGTG GAAGAAGACA GTCACCACTC TGTGGTTCCT CAATCTAGCC 240
ATTGCGGATT TCATTTTTCT TCTCTTTCTC CCCCTGTACA TCTCCTATCT GGCCATGAAT 300
TTCCACTGGC CC^SGCAT CTGGCTGTGC AAAGCCAATT CCTTO.CTGC 360
ATGTTTCCCA GTGTrmTT CCTGACAGTG ATCAGCCTGG ACCACTATAT CCACTTGATC 420
CATCCTGTCT TATCTCATCG GCATCGAACC CTCAAGAACT CTCTGATTGT CATTATATTC 480
ATCTGGCTTT TGGCTTCTCT AATTGGCGGT CCTGCCCTGT ACTTCCGGGA CACTGTGGAG 540
TTCAATAATC ATACTCTTTG CTATAACAAT TTTCAGAAGC ATGATCCTGA CCTCACTTTG 600
ATCAGGCACC ATGTTCTGAC TTGGGTGAAA TTTATCATTG GCTATCTCTT CCCTTTGCTA 660
ACAATGAGTA TTTGCTACTT GTGTCTCATC TTCAAGGTGA AGAAGCGAAC AGTCCTGATC 720
TCCAGTAGGC ATTTCTGGAC AATTCTGGTT GTGGTTGTGG CCTTTGTOGT TTGCTGGACT 780
CCTTATCACC TGTTTAGCAT TTGGGAGCTC ACCATTCACC ACAATAGCTA TTCCCACCAT 840
GTGATGCAGG CTGGAATCCC CCTCTCCACT GGTTTGGCAT TCCTCAATAG TTGCTTGAAC 900
CCCATCCTTT ATGTCCTAAT TAGTAAGAAG TTCCAAGCTC GCTTCCGGTC CTCAGTKSCT 960
GAGATACTCA AGTACACACT GTGGGAAGTC AGCTGTTCTG GCACAGTGAG TGAACAGCTC 1020
AGGAACTCAG AAACCAAfiAA TCTGTGTCTC CTGGAAACAG CTCAATAA 1068

(3) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Glu Asp Leu Glu Glu Thr Leu Phe Glu Glu Phe Glu Asn Tyr Ser
1 5 10 15

Tyr Asp Leu Asp Tyr Tyr Ser Leu Glu Ser Asp Leu Glu Glu Lys Val
20 25 30

Gln Leu Gly Val Val His Trp Val Ser Leu Val Leu Tyr Cys Leu Ala
35 40 45
WO 00/22129 PCT/US99/23938



Phe Val Leu Gly Ile Pro Gly Asn Ala Ile Val Ile Trp Phe Thr Gly
50 55 60

Leu Lys Trp Lys Lys Thr Val Thr Thr Leu Trp Phe Leu Asn Leu Ala
65 70 75 80

Ile Ala Asp Phe Ile Phe Leu Leu Phe Leu Pro Leu Tyr Ile Ser Tyr
85 90 95

Val Ala Met Asn Phe His Trp Pro Phe Gly Ile Trp Leu Cys Lys Ala
100 105 110

Asn Ser Phe Thr Ala Gln Leu Asn Met Phe Ala Ser Val Phe Phe Leu
115 120 125

Thr Val Ile Ser Leu Asp His Tyr Ile His Leu Ile His Pro Val Leu
130 135 140

Ser His Arg His Arg Thr Leu Lys Asn Ser Leu Ile Val Ile Ile Phe
145 150 155 160

Ile Trp Leu Leu Ala Ser Leu Ile Gly Gly Pro Ala Leu Tyr Phe Arg
165 170 175

Asp Thr Val Glu Phe Asn Asn His Thr Leu Cys Tyr Asn Asn Phe Gln
180 185 190

Lys His Asp Pro Asp Leu Thr Leu Ile Arg His His Val Leu Thr Trp
195 200 205

Val Lys Phe Ile Ile Gly Tyr Leu Phe Pro Leu Leu Thr Met Ser Ile
210 215 220

Cys Tyr Leu Cys Leu Ile Phe Lys Val Lys Lys Arg Thr Val Leu Ile
225 230 235 240

Ser Ser Arg His Phe Trp Thr Ile Leu Val Val Val Val Ala Phe Val
245 250 255

Val Cys Trp Thr Pro Tyr His Leu Phe Ser Ile Trp Glu Leu Thr Ile
260 265 270

His His Asn Ser Tyr Ser His His Val Met Gln Ala Gly Ile Pro Leu
275 280 285

Ser Thr Gly Leu Ala Phe Leu Asn Ser Cys Leu Asn Pro Ile Leu Tyr
290 295 300

Val Leu Ile Ser Lys Lys Phe Gln Ala Arg Phe Arg Ser Ser Val Ala
305 310 315 320

Glu Ile Leu Lys Tyr Thr Leu Trp Glu Val Ser Cys Ser Gly Thr Val
325 330 335

Ser Glu Gln Leu Arg Asn Ser Glu Thr Lys Asn Leu Cys Leu Leu Glu
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340 345 350



Thr Ala Gln
355

(4) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1089 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATGGGCAACC ACACGTGGGA GGGCTGCCAC GTGGACTCGC GCGTGGACCA CCTCTTTCCG 60
CCATCCCTCT ACATCTTTGT CATCGGCGTG GGGCTGCCCA CCAACTGCCT GGCTCTGTGG 120
GCGGCCTACC GCCAGGTGCA ACAGCGCAAC GAGCTGGGCG TCTACCTGAT GAACCTCAGC 180
ATCGCCGACC TGCTGTACAT CTGCACGCTG CCGCTGTGGG TGGACTACTT CCTGCACCAC 240
GACAACTGGA TCCACGGCCC CGGGTCCTGC AAGCTCTTTG GGTTCATCTT CTACACCAAT 300
ATCTACATCA GCATCGCCTT CCTGTGCTGC ATCTCGGTGG ACCGCTACCT GGCTGTGGCC 360
CACCCACTCC GCTTCGCCCG CCTGCGCCGC GTCAAGACCG CCGTGGCCGT GAGCTCCGTG 420
GTCTGGGCCA CGGAGCTGGG CGCCAACTCG GCGCCCCTGT TCCATGACGA GCTCTTCCGA 480
GACCGCTACA ACCACACCTT CTGCTTTGAG AAGTTCCCCA TGGAAGGCTG GGTGGCCTGG 540
ATGAACCTCT ATCGGGTGTT CGTGGGCTTC CTCTTCCCGT GGGCGCTCAT GCTGCTGTCG 600
TACCGGGGCA TCCTGCGGGC CGTGCGGGGC AGCGTGTCCA CCGAGCGCCA GGAGAAGGCC 660
AAGATCAAGC GGCTGGCCCT CAGCCTCATC GCCATCGTGC TGGTCTGCTT TGCGCCCTAT 720
CACGTGCTCT TGCTGTCCCG CAGCGCCATC TACCTGGGCC GCCCCTGGGA CTGCGGCTTC 780
GAGGAGCGCG TCTTTTCTGC ATACCACAGC TCACTGGCTT TCACCAGCCT CAACTGTCTG 840
GCGGACCCCA TCCTCTACTG CCTGGTCAAC GAGGGCGCCC GCAGCGATGT GGCCAAGGCC 900
CTGCACAACC TGCTCCGCTT TCTGGCCAGC GACAAGCCCC AGGAGATGGC CAATGCCTCG 960
CTCACCCTGG AGACCCCACT CACCTCCAAG AGGAACAGCA CAGCCAAAGC CATGACTGGC 1020
AGCTGGGCGG CCACTCCGCC TTCCCAGGGG GACCAGGTGC AGCTGAAGAT GCTGCCGCCA 1080
GCACAATGA 1089
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(5) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 



Met Gly Asn His Thr Trp Glu Gly Cys His Val Asp Ser Arg Val Asp
1 5 10 15

His Leu Phe Pro Pro Ser Leu Tyr Ile Phe Val Ile Gly Val Gly Leu
20 25 30

Pro Thr Asn Cys Leu Ala Leu Trp Ala Ala Tyr Arg Gln Val Gln Gln
35 40 45

Arg Asn Glu Leu Gly Val Tyr Leu Met Asn Leu Ser Ile Ala Asp Leu
50 55 60

Leu Tyr Ile Cys Thr Leu Pro Leu Trp Val Asp Tyr Phe Leu His His
65 70 75 80

Asp Asn Trp Ile His Gly Pro Gly Ser Cys Lys Leu Phe Gly Phe Ile
85 90 95

Phe Tyr Thr Asn Ile Tyr Ile Ser Ile Ala Phe Leu Cys Cys Ile Ser
100 105 110

Val Asp Arg Tyr Leu Ala Val Ala His Pro Leu Arg Phe Ala Arg Leu
115 120 125

Arg Arg Val Lys Thr Ala Val Ala Val Ser Ser Val Val Trp Ala Thr
130 135 140

Glu Leu Gly Ala Asn Ser Ala Pro Leu Phe His Asp Glu Leu Phe Arg
145 150 155 160

Asp Arg Tyr Asn His Thr Phe Cys Phe Glu Lys Phe Pro Met Glu Gly
165 170 175

Trp Val Ala Trp Met Asn Leu Tyr Arg Val Phe Val Gly Phe Leu Phe
180 185 190

Pro Trp Ala Leu Met Leu Leu Ser Tyr Arg Gly Ile Leu Arg Ala Val
195 200 205

Arg Gly Ser Val Ser Thr Glu Arg Gln Glu Lys Ala Lys Ile Lys Arg
210 215 220

Leu Ala Leu Ser Leu Ile Ala Ile Val Leu Val Cys Phe Ala Pro Tyr
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225 230 235 240



His Val Leu Leu Leu Ser Arg Ser Ala Ile Tyr Leu Gly Arg Pro Trp
245 250 255

Asp Cys Gly Phe Glu Glu Arg Val Phe Ser Ala Tyr His Ser Ser Leu
260 265 270

Ala Phe Thr Ser Leu Asn Cys Val Ala Asp Pro Ile Leu Tyr Cys Leu
275 280 285

Val Asn Glu Gly Ala Arg Ser Asp Val Ala Lys Ala Leu His Asn Leu
290 295 300

Leu Arg Phe Leu Ala Ser Asp Lys Pro Gln Glu Met Ala Asn Ala Ser
305 310 315 320

Leu Thr Leu Glu Thr Pro Leu Thr Ser Lys Arg Asn Ser Thr Ala Lys
325 330 335

Ala Met Thr Gly Ser Trp Ala Ala Thr Pro Pro Ser Gln Gly Asp Gln
340 345 350

Val Gln Leu Lys Met Leu Pro Pro Ala Gln
355 360
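
The stated lengths of the paired entries above agree with one another if each nucleic acid sequence is read as a complete open reading frame that ends in a single stop codon, which is an assumption made here for illustration rather than a statement from the listing. Under that assumption a sequence of N base pairs encodes N/3 - 1 residues. A minimal Python sketch of the check follows; the function names are hypothetical.

```python
# Stated lengths copied from the listing above:
# (nucleic acid SEQ ID NO, base pairs, protein SEQ ID NO, amino acids).
STATED_LENGTHS = [
    (1, 1068, 2, 355),
    (3, 1089, 4, 362),
]


def expected_residues(base_pairs: int) -> int:
    """Residues encoded by an ORF of `base_pairs` bases ending in one stop codon."""
    if base_pairs % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return base_pairs // 3 - 1  # the final codon is the stop codon


def check_lengths(pairs=STATED_LENGTHS):
    """Map (dna_id, protein_id) to True when the stated lengths are consistent."""
    return {(dna_id, prot_id): expected_residues(bp) == aa
            for dna_id, bp, prot_id, aa in pairs}


if __name__ == "__main__":
    for ids, ok in check_lengths().items():
        print(ids, "consistent" if ok else "inconsistent")
```

A mismatch would point to a transcription error in one of the two entries rather than to a property of the receptor itself.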

(6) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TATGAATTCA GATGCTCTAA ACGTCCCTGC 30
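
The 30-base oligonucleotide of SEQ ID NO: 5 contains the hexamer GAATTC, which is the recognition sequence of the restriction enzyme EcoRI. The following Python sketch shows one way to normalize a listing line and locate such sites. The helper names and the small enzyme table are illustrative assumptions, and the listing itself does not state which enzymes were intended.

```python
# Recognition sequences of three common restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def normalize(listing_text: str) -> str:
    """Keep only A, C, G, T; drops spaces, position numbers and stray marks."""
    return "".join(ch for ch in listing_text.upper() if ch in "ACGT")


def find_sites(sequence: str, sites=RECOGNITION_SITES) -> dict:
    """Return {enzyme: [1-based start positions]} for every site that occurs."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based numbering
            start = sequence.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


SEQ_ID_NO_5 = normalize("TATGAATTCA GATGCTCTAA ACGTCCCTGC 30")

if __name__ == "__main__":
    print(len(SEQ_ID_NO_5), find_sites(SEQ_ID_NO_5))
```

Because `normalize` silently discards any character that is not a base, the length should be compared with the stated LENGTH field before the result is trusted.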

(7) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TCCGGATCCA CCTGCACCTG CGCCTGCACC 30

(8) INFORMATION FOR SEQ ID NO: 7:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATGGAGTCCT CAGGCAACCC AGAGAGCACC ACCTTTTTTT ACTATGACCT TCAGAGCCAG 60 

CCGTGTGAGA ACCAGGCCTG GGTCTTTGCT ACCCTCGCCA CCACTGTCCT GTACTGCCTG 120 

GTGTTTCTCC TCAGCCTAGT GGGCAACAGC CTGGTCCTGT GGGTCCTGGT GAAGTATGAG 180 

AGCCTGGAGT CCCTCACCAA CATCTTCATC CTCAACCTGT GCCTCTCAGA CCTGGTGTTC 240 

GCCTGCTTGT TGCCTGTGTG GATCTCCCCA TACCACTGGG GCTGGGTGCT GGGAGACTTC 300 

CTCTGCAAAC TCCTCAATAT GATCTTCTCC ATCAGCCTCT ACAGCAGCAT CTTCTTCCTG 360 

ACCATCATGA CCATCCACCG CTACCTGTCG GTAGTGAGCC CCCTCTCCAC CCTGCGCGTC 420

CCCACCCTCC GCTGCCGGGT GCTGGTGACC ATGGCTGTGT GGGTAGCCAG CATCCTGTCC 480 

TCCATCCTCG ACACCATCTT CCACAAGGTG CTTTCTTCGG GCTGTGATTA TTCCGAACTC 540 

ACGTGGTACC TCACCTCCGT CTACCAGCAC AACCTCTTCT TCCTGCTGTC CCTGGGGATT 600 

ATCCTGTTCT GCTACGTGGA GATCCTCAGG ACCCTGTTCC GCTCACGCTC CAAGCGGCGC 660 

CACCGCACGG TCAAGCTCAT CTTCGCCATC GTGGTGGCCT ACTTCCTCAG CTGGGGTCCC 720 

TACAACTTCA CCCTGTTTCT GCAGACGCTG TTTCGGACCC AGATCATCCG GAGCTGCGAG 780 

GCCAAACAGC AGCTAGAATA CGCCCTGCTC ATCTGCCGCA ACCTCGCCTT CTCCCACTGC 840 

TGCTTTAACC CGGTGCTCTA TGTCTTCGTG GGGGTCAAGT TCCGCACACA CCTGAAACAT 900 

GTTCTCCGGC AGTTCTGGTT CTGCCGGCTG CAGGCACCCA GCCCAGCCTC GATCCCCCAC 960 

TCCCCTGGTG CCTTCGCCTA TGAGGGCGCC TCCTTCTACT GA 1002 
(9) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
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Met Glu Ser Ser Gly Asn Pro Glu Ser Thr Thr Phe Phe Tyr Tyr Asp 
1 5 10 15

Leu Gln Ser Gln Pro Cys Glu Asn Gln Ala Trp Val Phe Ala Thr Leu
20 25 30

Ala Thr Thr Val Leu Tyr Cys Leu Val Phe Leu Leu Ser Leu Val Gly
35 40 45

Asn Ser Leu Val Leu Trp Val Leu Val Lys Tyr Glu Ser Leu Glu Ser
50 55 60

Leu Thr Asn Ile Phe Ile Leu Asn Leu Cys Leu Ser Asp Leu Val Phe
65 70 75 80

Ala Cys Leu Leu Pro Val Trp Ile Ser Pro Tyr His Trp Gly Trp Val
85 90 95

Leu Gly Asp Phe Leu Cys Lys Leu Leu Asn Met Ile Phe Ser Ile Ser
100 105 110

Leu Tyr Ser Ser Ile Phe Phe Leu Thr Ile Met Thr Ile His Arg Tyr
115 120 125

Leu Ser Val Val Ser Pro Leu Ser Thr Leu Arg Val Pro Thr Leu Arg
130 135 140

Cys Arg Val Leu Val Thr Met Ala Val Trp Val Ala Ser Ile Leu Ser
145 150 155 160

Ser Ile Leu Asp Thr Ile Phe His Lys Val Leu Ser Ser Gly Cys Asp
165 170 175

Tyr Ser Glu Leu Thr Trp Tyr Leu Thr Ser Val Tyr Gln His Asn Leu
180 185 190

Phe Phe Leu Leu Ser Leu Gly Ile Ile Leu Phe Cys Tyr Val Glu Ile
195 200 205

Leu Arg Thr Leu Phe Arg Ser Arg Ser Lys Arg Arg His Arg Thr Val
210 215 220

Lys Leu Ile Phe Ala Ile Val Val Ala Tyr Phe Leu Ser Trp Gly Pro
225 230 235 240

Tyr Asn Phe Thr Leu Phe Leu Gln Thr Leu Phe Arg Thr Gln Ile Ile
245 250 255

Arg Ser Cys Glu Ala Lys Gln Gln Leu Glu Tyr Ala Leu Leu Ile Cys
260 265 270

Arg Asn Leu Ala Phe Ser His Cys Cys Phe Asn Pro Val Leu Tyr Val
275 280 285
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Phe Val Gly Val Lys Phe Arg Thr His Leu Lys His Val Leu Arg Gln
290 295 300 

Phe Trp Phe Cys Arg Leu Gln Ala Pro Ser Pro Ala Ser Ile Pro His
305 310 315 320

Ser Pro Gly Ala Phe Ala Tyr Glu Gly Ala Ser Phe Tyr

325 330
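
The amino acid sequence of SEQ ID NO: 8 can be compared with a translation of SEQ ID NO: 7 under the standard genetic code. The Python sketch below is offered as an illustration only; `translate` is a hypothetical helper. It reproduces the first ten residues (Met Glu Ser Ser Gly Asn Pro Glu Ser Thr) from the first thirty bases and the final three residues (Ser Phe Tyr) from the last twelve bases, which end in the stop codon TGA.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate frame 1 of `dna` into one-letter residues, stopping at a stop codon.

    Whitespace is ignored. Any character other than A, C, G or T raises
    KeyError, which makes a garbled base visible instead of mistranslating it.
    """
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    print(translate("ATGGAGTCCT CAGGCAACCC AGAGAGCACC"))  # start of SEQ ID NO: 7
    print(translate("TCCTTCTACT GA"))                     # end of SEQ ID NO: 7
```

Raising on unrecognized characters is a deliberate choice for a listing recovered from a scanned document, where a digit standing in for a base should stop the comparison.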

(10) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GCAAGCTTGG GGGACGCCAG GTCGCCGGCT 30

(11) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GCGGATCCGG ACGCTGGGGG AGTCAGGCTG C 31

(12) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 987 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATGGACAACG CCTCGTTCTC GGAGCCCTGG CCCGCCAACG CATCGGGCCC GGACCCGGCG 60 

CTGAGCTGCT CCAACGCGTC GACTCTGGCG CCGCTGCCGG CGCCGCTGGC GGTGGCTGTA 120 

CCAGTTGTCT ACGCGGTGAT CTGCGCCGTG GGTCTGGCGG GCAACTCCGC CGTGCTGTAC 180
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GTGTTCCTGC GGGCGCCCCG CATGAAGACC GTCACCAACC TCTTCATCCT CAACCTGGCC 240 

ATCGCCGACG AGCTCTTCAC GCTGGTGCTG CCCATCAACA TCGCCGACTT CCTGCTGCGG 300

CAGTGGCCCT TCGGGGAGCT CATGTGCAAG CTCATCGTGG CTATCGACCA GTACAACACC 360

TTCTCCAGCC TCTACTTCCT CACCGTCATG AGCGCCGACC GCTACCTGGT GGTGTTGGCC 420

ACTGCGGAGT CGCGCCGGGT GGCCGGCCGC ACCTACAGCG CCGCGCGCGC GGTGAGCCTC 480

GCCGTGTGGG GGATCGTCAC ACTCGTCGTG CTGCCCTTCG CAGTCTTCGC CCGGCTAGAC 540

GACGAGCAGG GCCGGCGCCA GTGCGTGCTA GTCTTTCCGC AGCCCGAGGC CTTCTGGTGG 600

CGCGCGAGCC GCCTCTACAC GCTGGTGCTG GGCTTCGCCA TCCCCGTGTC CACCATCTGT 660

GTCCTCTATA CCACCCTGCT GTGCCGGCTG CATGCCATGC GGCTGGACAG CCACGCCAAG 720

GCCCTGGAGC GCGCCAAGAA GCGGGTGACC TTCCTGGTGG TGGCAATCCT GGCGGTGTGC 780

CTCCTCTGCT GGACGCCCTA CCACCTGAGC ACCGTGGTGG CGCTCACCAC CGACCTbcCG 840

CAGACGCCGC TGGTCATCGC TATCTCCTAC TTCATCACCA GCCTGACGTA CGCCAACAGC 900

TGCCTCAACC CCTTCCTCTA CGCCTTCCTO GACGCCAGCT TCCGCAGGAA CCTCCGCCAG 960

CTOATAACTT GCCGCGCGGC AGCCTGA 987

(13) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 328 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Asp Asn Ala Ser Phe Ser Glu Pro Trp Pro Ala Asn Ala Ser Gly
1 5 10 15

Pro Asp Pro Ala Leu Ser Cys Ser Asn Ala Ser Thr Leu Ala Pro Leu
20 25 30

Pro Ala Pro Leu Ala Val Ala Val Pro Val Val Tyr Ala Val Ile Cys
35 40 45

Ala Val Gly Leu Ala Gly Asn Ser Ala Val Leu Tyr Val Leu Leu Arg
50 55 60

Ala Pro Arg Met Lys Thr Val Thr Asn Leu Phe Ile Leu Asn Leu Ala
65 70 75 80
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Ile Ala Asp Glu Leu Phe Thr Leu Val Leu Pro Ile Asn Ile Ala Asp
85 90 95 

Phe Leu Leu Arg Gln Trp Pro Phe Gly Glu Leu Met Cys Lys Leu Ile
100 105 110

Val Ala Ile Asp Gln Tyr Asn Thr Phe Ser Ser Leu Tyr Phe Leu Thr
115 120 125

Val Met Ser Ala Asp Arg Tyr Leu Val Val Leu Ala Thr Ala Glu Ser
130 135 140

Arg Arg Val Ala Gly Arg Thr Tyr Ser Ala Ala Arg Ala Val Ser Leu
145 150 155 160

Ala Val Trp Gly Ile Val Thr Leu Val Val Leu Pro Phe Ala Val Phe
165 170 175

Ala Arg Leu Asp Asp Glu Gln Gly Arg Arg Gln Cys Val Leu Val Phe
180 185 190

Pro Gln Pro Glu Ala Phe Trp Trp Arg Ala Ser Arg Leu Tyr Thr Leu
195 200 205

Val Leu Gly Phe Ala Ile Pro Val Ser Thr Ile Cys Val Leu Tyr Thr
210 215 220

Thr Leu Leu Cys Arg Leu His Ala Met Arg Leu Asp Ser His Ala Lys
225 230 235 240

Ala Leu Glu Arg Ala Lys Lys Arg Val Thr Phe Leu Val Val Ala Ile
245 250 255

Leu Ala Val Cys Leu Leu Cys Trp Thr Pro Tyr His Leu Ser Thr Val
260 265 270

Val Ala Leu Thr Thr Asp Leu Pro Gln Thr Pro Leu Val Ile Ala Ile
275 280 285

Ser Tyr Phe Ile Thr Ser Leu Thr Tyr Ala Asn Ser Cys Leu Asn Pro
290 295 300

Phe Leu Tyr Ala Phe Leu Asp Ala Ser Phe Arg Arg Asn Leu Arg Gln
305 310 315 320

Leu Ile Thr Cys Arg Ala Ala Ala
325
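
Protein entries in this listing use three-letter residue codes with position numbers on the line beneath. For comparison with one-letter sequences, a conversion along the following lines may be useful. This is a minimal Python sketch with a hypothetical helper name, not part of the listing.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_text: str) -> str:
    """Convert listing lines to a one-letter string, skipping position numbers.

    A token that is neither a number nor a known residue code raises
    KeyError, so a misprinted code is reported instead of being dropped.
    """
    residues = []
    for token in listing_text.split():
        if token.isdigit():
            continue  # position marker such as 5, 10, 15
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


if __name__ == "__main__":
    first_line = """Met Asp Asn Ala Ser Phe Ser Glu Pro Trp Pro Ala Asn Ala Ser Gly
    1 5 10 15"""
    print(to_one_letter(first_line))
```

The example input is the first line of SEQ ID NO: 12 with its position markers.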

(14) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear
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15 
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(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CGGAATTCGT CAACGGTCCC AGCTACAATG 30

(15) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATGGATCCCA GGCCCTTCAG CACCGCAATA T 31

(16) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATGCAGGCCG CTGGGCACCC AGAGCCCCTT GACAGCAGGG GCTCCTTCTC CCTCCCCACG 60 

ATGGGTGCCA ACGTCTCTCA GGACAATGGC ACTGGCCACA ATGCCACCTT CTCCGAGCCA 120 

CTGCCGTTCC TCTATGTGCT CCTGCCCGCC GTGTACTCCG GGATCTGTGC TGTGGGGCTG 180 

ACTGGCAACA CGGCCGTCAT CCTTGTAATC CTAAGGGCGC CCAAGATGAA GACGGTGACC 240

AACGTGTTCA TCCTGAACCT GGCCGTCGCC GACGGGCTCT TCACGCTGGT ACTGCCCGTC 300

AACATCGCGG AGCACCTGCT GCAGTACTGG CCCTTCGGGG AGCTGCTCTG CAAGCTGGTG 360

CTGGCCGTCG ACCACTACAA CATCTTCTCC AGCATCTACT TCCTAGCCGT GATGAGCGTG 420

GACCGATACC TGGTGGTGCT GGCCACCGTG AGGTCCCGCC ACATGCCCTG GCGCACCTAC 480

CGGGGGGCGA AGGTCGCCAG CCTGTGTGTC TGGCTGGGCG TCACGGTCCT GGTTCTGCCC 540

TTCTTCTCTT TCGCTGGCGT CTACAGCAAC GAGCTGCAGG TCCCAAGCTG TGGGCTGAGC 600

TTCCCGTGGC CCGAGCGGGT CTGGTTCAAG GCCAGCCGTG TCTACACTTT GGTCCTGGGC 660 

TTCGTGCTGC CCGTGTGCAC CATCTGTGTG CTCTACACAG ACCTCCTGCG CAGGCTGCGG 720 
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GCCGTGCGGC TCCGCTCTGG AGCCAAGGCT CTAGGCAAGG CCAGGCGGAA GGTGACCGTC 780 

CTGGTCCTCG TCGTGCTGGC CGTGTGCCTC CTCTGCTGGA CGCCCTTCCA CCTGGCCTCT 840 

GTCGTGGCCC TGACCACGGA CCTGCCCCAG ACCCCACTGG TCATCAGTAT GTCCTACGTC 900 

ATCACCAGCC TCACGTACGC CAACTCGTGC CTGAACCCCT TCCTCTACGC CTTTCTAGAT 960 

GACAACTTCC GGAAGAACTT CCGCAGCATA TTGCGGTGCT GA 1002 
(17) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Gln Ala Ala Gly His Pro Glu Pro Leu Asp Ser Arg Gly Ser Phe
1 5 10 15

Ser Leu Pro Thr Met Gly Ala Asn Val Ser Gln Asp Asn Gly Thr Gly
20 25 30

His Asn Ala Thr Phe Ser Glu Pro Leu Pro Phe Leu Tyr Val Leu Leu
35 40 45

Pro Ala Val Tyr Ser Gly Ile Cys Ala Val Gly Leu Thr Gly Asn Thr
50 55 60

Ala Val Ile Leu Val Ile Leu Arg Ala Pro Lys Met Lys Thr Val Thr
65 70 75 80

Asn Val Phe Ile Leu Asn Leu Ala Val Ala Asp Gly Leu Phe Thr Leu
85 90 95

Val Leu Pro Val Asn Ile Ala Glu His Leu Leu Gln Tyr Trp Pro Phe
100 105 110

Gly Glu Leu Leu Cys Lys Leu Val Leu Ala Val Asp His Tyr Asn Ile
115 120 125

Phe Ser Ser Ile Tyr Phe Leu Ala Val Met Ser Val Asp Arg Tyr Leu
130 135 140

Val Val Leu Ala Thr Val Arg Ser Arg His Met Pro Trp Arg Thr Tyr
145 150 155 160

Arg Gly Ala Lys Val Ala Ser Leu Cys Val Trp Leu Gly Val Thr Val
165 170 175

Leu Val Leu Pro Phe Phe Ser Phe Ala Gly Val Tyr Ser Asn Glu Leu
180 185 190

Gln Val Pro Ser Cys Gly Leu Ser Phe Pro Trp Pro Glu Arg Val Trp
195 200 205

Phe Lys Ala Ser Arg Val Tyr Thr Leu Val Leu Gly Phe Val Leu Pro
210 215 220

Val Cys Thr Ile Cys Val Leu Tyr Thr Asp Leu Leu Arg Arg Leu Arg
225 230 235 240

Ala Val Arg Leu Arg Ser Gly Ala Lys Ala Leu Gly Lys Ala Arg Arg
245 250 255

Lys Val Thr Val Leu Val Leu Val Val Leu Ala Val Cys Leu Leu Cys
260 265 270

Trp Thr Pro Phe His Leu Ala Ser Val Val Ala Leu Thr Thr Asp Leu
275 280 285

Pro Gln Thr Pro Leu Val Ile Ser Met Ser Tyr Val Ile Thr Ser Leu
290 295 300

Thr Tyr Ala Asn Ser Cys Leu Asn Pro Phe Leu Tyr Ala Phe Leu Asp
305 310 315 320

Asp Asn Phe Arg Lys Asn Phe Arg Ser Ile Leu Arg Cys
325 330

(18) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ACGAATTCAG CCATGGTCCT TGAGGTGAGT GACCACCAAG TGCTAAAT 48

(19) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
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GAGGATCCTG GAATGCGGGG AAGTCAG 27 

(20) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

ATGGTCCTTG AGGTGAGTGA CCACCAAGTG CTAAATGACG CCGAGGTTGC CGCCCTCCTG 60 

GAGAACTTCA GCTCTTCCTA TGACTATGGA GAAAACGAGA GTGACTCGTG CTGTACCTCC 120 

CCGCCCTGCC CACAGGACTT CAGCCTGAAC TTCGACCGGG CCTTCCTGCC AGCCCTCTAC 180

AGCCTCCTCT TTCTGCTGGG GCTGCTGGGC AACGGCGCGG TGGCAGCCGT GCTGCTGAGC 240 

CGGCGGACAG CCCTGAGCAG CACCGACACC TTCCTGCTCC ACCTAGCTGT AGCAGACACG 300 

CTGCTGGTGC TGACACTGCC GCTCTGGGCA GTGGACGCTG CCGTCCAGTG GGTCTTTGGC 360 

TCTGGCCTCT GCAAAGTGGC AGGTGCCCTC TTCAACATCA ACTTCTACGC AGGAGCCCTC 420 

CTGCTGGCCT GCATCAGCTT TGACCGCTAC CTGAACATAG TTCATGCCAC CCAGCTCTAC 480 

CGCCGGGGGC CCCCGGCCCG CGTGACCCTC ACCTGCCTGG CTGTCTGGGG GCTCTGCCTG 540 

CTTTTCGCCC TCCCAGACTT CATCTTCCTG TCGGCCCACC ACGACGAGCG CCTCAACGCC 600 

ACCCACTGCC AATACAACTT CCCACAGGTG GGCCGCACGG CTCTGCGGGT GCTGCAGCTG 660 

GTGGCTGGCT TTCTGCTGCC CCTGCTGGTC ATGGCCTACT GCTATGCCCA CATCCTGGCC 720 

GTGCTGCTGG TTTCCAGGGG CCAGCGGCGC CTGCGGGCCA TGCGGCTGGT GGTGGTGGTC 780 

GTGGTGGCCT TTGCCCTCTG CTGGACCCCC TATCACCTGG TGGTGCTGGT GGACATCCTC 840 

ATGGACCTGG GCGCTTTGGC CCGCAACTGT GGCCGAGAAA GCAGGGTAGA CGTGGCCAAG 900 

TCGGTCACCT CAGGCCTGGG CTACATGCAC TGCTGCCTCA ACCCGCTGCT CTATGCCTTT 960 

GTAGGGGTCA AGTTCCGGGA GCGGATGTGG ATGCTGCTCT TGCGCCTGGG CTGCCCCAAC 1020 

CAGAGAGGGC TCCAGAGGCA GCCATCGTCT TCCCGCCGGG ATTCATCCTG GTCTGAGACC 1080

TCAGAGGCCT CCTACTCGGG CTTGTGA 1107 

(21) INFORMATION FOR SEQ ID NO: 20: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Val Leu Glu Val Ser Asp His Gln Val Leu Asn Asp Ala Glu Val
1 5 10 15

Ala Ala Leu Leu Glu Asn Phe Ser Ser Ser Tyr Asp Tyr Gly Glu Asn
20 25 30

Glu Ser Asp Ser Cys Cys Thr Ser Pro Pro Cys Pro Gln Asp Phe Ser
35 40 45

Leu Asn Phe Asp Arg Ala Phe Leu Pro Ala Leu Tyr Ser Leu Leu Phe
50 55 60

Leu Leu Gly Leu Leu Gly Asn Gly Ala Val Ala Ala Val Leu Leu Ser
65 70 75 80

Arg Arg Thr Ala Leu Ser Ser Thr Asp Thr Phe Leu Leu His Leu Ala
85 90 95

Val Ala Asp Thr Leu Leu Val Leu Thr Leu Pro Leu Trp Ala Val Asp
100 105 110

Ala Ala Val Gln Trp Val Phe Gly Ser Gly Leu Cys Lys Val Ala Gly
115 120 125

Ala Leu Phe Asn Ile Asn Phe Tyr Ala Gly Ala Leu Leu Leu Ala Cys
130 135 140

Ile Ser Phe Asp Arg Tyr Leu Asn Ile Val His Ala Thr Gln Leu Tyr
145 150 155 160

Arg Arg Gly Pro Pro Ala Arg Val Thr Leu Thr Cys Leu Ala Val Trp
165 170 175

Gly Leu Cys Leu Leu Phe Ala Leu Pro Asp Phe Ile Phe Leu Ser Ala
180 185 190

His His Asp Glu Arg Leu Asn Ala Thr His Cys Gln Tyr Asn Phe Pro
195 200 205

Gln Val Gly Arg Thr Ala Leu Arg Val Leu Gln Leu Val Ala Gly Phe
210 215 220

Leu Leu Pro Leu Leu Val Met Ala Tyr Cys Tyr Ala His Ile Leu Ala
225 230 235 240
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Val Leu Leu Val Ser Arg Gly Gln Arg Arg Leu Arg Ala Met Arg Leu
245 250 255 

Val Val Val Val Val Val Ala Phe Ala Leu Cys Trp Thr Pro Tyr His
260 265 270

Leu Val Val Leu Val Asp Ile Leu Met Asp Leu Gly Ala Leu Ala Arg
275 280 285

Asn Cys Gly Arg Glu Ser Arg Val Asp Val Ala Lys Ser Val Thr Ser
290 295 300

Gly Leu Gly Tyr Met His Cys Cys Leu Asn Pro Leu Leu Tyr Ala Phe
305 310 315 320

Val Gly Val Lys Phe Arg Glu Arg Met Trp Met Leu Leu Leu Arg Leu
325 330 335

Gly Cys Pro Asn Gln Arg Gly Leu Gln Arg Gln Pro Ser Ser Ser Arg
340 345 350

Arg Asp Ser Ser Trp Ser Glu Thr Ser Glu Ala Ser Tyr Ser Gly Leu
355 360 365

(22) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TTAAGCTTGA CCTAATGCCA TCTTGTGTCC 30 

(23) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TTGGATCCAA AAGAACCATG CACCTCAGAG 30 

(24) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 1074 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATGGCTGATG ACTATGGCTC TGAATCCACA TCTTCCATGG AAGACTACGT TAACTTCAAC 60

TTCACTGACT TCTACTGTGA GAAAAACAAT GTCAGGCAGT TTGCGAGCCA TTTCCTCCCA 120 

CCCTTGTACT GGCTCGTGTT CATCGTGGGT GCCTTGGGCA ACAGTCTTGT TATCCTTGTC 180

TACTGGTACT GCACAAGAGT GAAGACCATG ACCGACATGT TCCTTTTGAA TTTGGCAATT 240

GCTGACCTCC TCTTTCTTGT CACTCTTCCC TTCTGGGCCA TTGCTGCTGC TGACCAGTGG 300

AAGTTCCAGA CCTTCATGTG CAAGGTGGTC AACAGCATGT ACAAGATGAA CTTCTACAGC 360

TGTGTGTTGC TGATCATGTG CATCAGCGTG GACAGGTACA TTGCCATTGC CCAGGCCATG 420

AGAGCACATA CTTGGAGGGA GAAAAGGCTT TTGTACAGCA AAATGGTTTG CTTTACCATC 480

TGGGTATTGG CAGCTGCTCT CTGCATCCCA GAAATCTTAT ACAGCCAAAT CAAGGAGGAA 540 

TCCGGCATTG CTATCTGCAC CATGGTTTAC CCTAGCGATG AGAGCACCAA ACTGAAGTCA 600 

GCTGTCTTGA CCCTGAAGGT CATTCTGGGG TTCTTCCTTC CCTTCGTGGT CATGGCTTGC 660 

TGCTATACCA TCATCATTCA CACCCTGATA CAAGCCAAGA AGTCTTCCAA GCACAAAGCC 720 

CTAAAAGTGA CCATCACTGT CCTGACCGTC TTTGTCTTGT CTCAGTTTCC CTACAACTGC 780 

ATTTTGTTGG TGCAOACCAT TGACGCCTAT GCCATGTTCA TCTCCAACTG TGCCGTTTCC 840

ACCAACATTG ACATCTGCTT CCAGGTCACC CAGACCATCG CCTTCTTCCA CAGTTGCCTG 900 

AACCCTGTTC TCTATGTTTT TGTGGGTGAG AGATTCCGCC GGGATCTCGT GAAAACCCTG 960 

AAGAACTTGG GTTGCATCAG CCAGGCCCAG TGGGTTTCAT TTACAAGGAG AGAGGGAAGC 1020

TTGAAGCTGT CGTCTATGTT GCTGGAGACA ACCTCAGGAG CACTCTCCCT CTGA 1074 
(25) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Ala Asp Asp Tyr Gly Ser Glu Ser Thr Ser Ser Met Glu Asp Tyr 
1 5 10 15 

Val Asn Phe Asn Phe Thr Asp Phe Tyr Cys Glu Lys Asn Asn Val Arg 
20 25 30 

Gln Phe Ala Ser His Phe Leu Pro Pro Leu Tyr Trp Leu Val Phe Ile
35 40 45

Val Gly Ala Leu Gly Asn Ser Leu Val Ile Leu Val Tyr Trp Tyr Cys
50 55 60

Thr Arg Val Lys Thr Met Thr Asp Met Phe Leu Leu Asn Leu Ala Ile
65 70 75 80

Ala Asp Leu Leu Phe Leu Val Thr Leu Pro Phe Trp Ala Ile Ala Ala
85 90 95

Ala Asp Gln Trp Lys Phe Gln Thr Phe Met Cys Lys Val Val Asn Ser
100 105 110

Met Tyr Lys Met Asn Phe Tyr Ser Cys Val Leu Leu Ile Met Cys Ile
115 120 125

Ser Val Asp Arg Tyr Ile Ala Ile Ala Gln Ala Met Arg Ala His Thr
130 135 140

Trp Arg Glu Lys Arg Leu Leu Tyr Ser Lys Met Val Cys Phe Thr Ile
145 150 155 160

Trp Val Leu Ala Ala Ala Leu Cys Ile Pro Glu Ile Leu Tyr Ser Gln
165 170 175

Ile Lys Glu Glu Ser Gly Ile Ala Ile Cys Thr Met Val Tyr Pro Ser
180 185 190

Asp Glu Ser Thr Lys Leu Lys Ser Ala Val Leu Thr Leu Lys Val Ile
195 200 205

Leu Gly Phe Phe Leu Pro Phe Val Val Met Ala Cys Cys Tyr Thr Ile
210 215 220

Ile Ile His Thr Leu Ile Gln Ala Lys Lys Ser Ser Lys His Lys Ala
225 230 235 240

Leu Lys Val Thr Ile Thr Val Leu Thr Val Phe Val Leu Ser Gln Phe
245 250 255

Pro Tyr Asn Cys Ile Leu Leu Val Gln Thr Ile Asp Ala Tyr Ala Met
260 265 270

Phe Ile Ser Asn Cys Ala Val Ser Thr Asn Ile Asp Ile Cys Phe Gln
275 280 285
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Val Thr Gln Thr Ile Ala Phe Phe His Ser Cys Leu Asn Pro Val Leu
290 295 300 

Tyr Val Phe Val Gly Glu Arg Phe Arg Arg Asp Leu Val Lys Thr Leu
305 310 315 320

Lys Asn Leu Gly Cys Ile Ser Gln Ala Gln Trp Val Ser Phe Thr Arg
325 330 335

Arg Glu Gly Ser Leu Lys Leu Ser Ser Met Leu Leu Glu Thr Thr Ser
340 345 350

Gly Ala Leu Ser Leu
355

(26) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1110 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATGGCCTCAT CGACCACTCG GGGCCCCAGG GTTTCTGACT TATTTTCTGG GCTGCCGCCG 60

GCGGTCACAA CTCCCGCCAA CCAGAGCGCA GAGGCCTCGG CGGGCAACGG GTCGGTGGCT 120

GGCGCGGACG CTCCAGCCGT CACGCCCTTC CAGAGCCTGC AGCTGGTGCA TCAGCTGAAG 180 

GGGCTGATCG TGCTGCTCTA CAGCGTCGTG GTGGTCGTGG GGCTGGTGGG CAACTGCCTG 240 

CTGGTGCTGG TGATCGCGCG GGTGCCGCGG CTGCACAACG TGACGAACTT CCTCATCGGC 300 

AACCTGGCCT TGTCCGACGT GCTCATGTGC ACCGCCTGCG TGCCGCTCAC GCTGGCCTAT 360

GCCTTCGAGC CACGCGGCTG GGTGTTCGGC GGCGGCCTGT GCCACCTGGT CTTCTTCCTG 420

CAGCCGGTCA CCGTCTATGT GTCGGTGTTC ACGCTCACCA CCATCGCAGT GGACCGCTAC 480

GTCGTGCTGG TGCACCCGCT GAGGCGCGCA TCTCGCTGCG CCTCAGCCTA CGCTGTGCTG 540

GCCATCTGGG CGCTGTCCGC GGTGCTGGCG CTGCCGCCCG CCGTGCACAC CTATCACGTG 600

GAGCTCAAGC CGCACGACGT GCGCCTCTGC GAGGAGTTCT GGGGCTCCCA GGAGCGCCAG 660 

CGCCAGCTCT ACGCCTGGGG GCTGCTGCTG GTCACCTACC TGCTCCCTCT GCTGGTCATC 720 

CTCCTGTCTT ACGTCCGGGT GTCAGTGAAG CTCCGCAACC GCGTGGTCCC GGGCTCCGTG 780 

ACCCAGAGCC AGGCCGACTG GGACCGCGCT CGGCGCCGGC GCACCTTCTG CTTGCTGGTG 840 
GTGGTCGTGG TGGTGTTCGC CGTCTGCTGG CTGCCGCTGC ACGTCTTCAA CCTGCTGCGG 900 
GACCTCGACC CCCACGCCAT CGACCCTTAC GCCTTTGGGC TGGTGCAGCT GCTCTGCCAC 960 
TGGCTCGCCA TGAGTTCGGC CTGCTACAAC CCCTTCATCT ACGCCTGGCT GCACGACAGC 1020 
TTCCGCGAGG AGCTGCGCAA ACTGTTGGTC GCTTGGCCCC GCAAGATAGC CCCCCATGGC 1080 
CAGAATATGA CCGTCAGCGT GGTCATCTGA 1110 
(27) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 369 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Ala Ser Ser Thr Thr Arg Gly Pro Arg Val Ser Asp Leu Phe Ser 
1 5 10 15 

Gly Leu Pro Pro Ala Val Thr Thr Pro Ala Asn Gln Ser Ala Glu Ala 
20 25 30 

Ser Ala Gly Asn Gly Ser Val Ala Gly Ala Asp Ala Pro Ala Val Thr 
35 40 45 

Pro Phe Gln Ser Leu Gln Leu Val His Gln Leu Lys Gly Leu Ile Val 
50 55 60 

Leu Leu Tyr Ser Val Val Val Val Val Gly Leu Val Gly Asn Cys Leu 
65 70 75 80 

Leu Val Leu Val Ile Ala Arg Val Pro Arg Leu His Asn Val Thr Asn 
85 90 95 

Phe Leu Ile Gly Asn Leu Ala Leu Ser Asp Val Leu Met Cys Thr Ala 
100 105 110 

Cys Val Pro Leu Thr Leu Ala Tyr Ala Phe Glu Pro Arg Gly Trp Val 
115 120 125 

Phe Gly Gly Gly Leu Cys His Leu Val Phe Phe Leu Gln Pro Val Thr 
130 135 140 

Val Tyr Val Ser Val Phe Thr Leu Thr Thr Ile Ala Val Asp Arg Tyr 
145 150 155 160 

Val Val Leu Val His Pro Leu Arg Arg Ala Ser Arg Cys Ala Ser Ala 
165 170 175 

Tyr Ala Val Leu Ala Ile Trp Ala Leu Ser Ala Val Leu Ala Leu Pro 
180 185 190 

Pro Ala Val His Thr Tyr His Val Glu Leu Lys Pro His Asp Val Arg 
195 200 205 

Leu Cys Glu Glu Phe Trp Gly Ser Gln Glu Arg Gln Arg Gln Leu Tyr 
210 215 220 

Ala Trp Gly Leu Leu Leu Val Thr Tyr Leu Leu Pro Leu Leu Val Ile 
225 230 235 240 

Leu Leu Ser Tyr Val Arg Val Ser Val Lys Leu Arg Asn Arg Val Val 
245 250 255 

Pro Gly Cys Val Thr Gln Ser Gln Ala Asp Trp Asp Arg Ala Arg Arg 
260 265 270 

Arg Arg Thr Phe Cys Leu Leu Val Val Val Val Val Val Phe Ala Val 
275 280 285 

Cys Trp Leu Pro Leu His Val Phe Asn Leu Leu Arg Asp Leu Asp Pro 
290 295 300 

His Ala Ile Asp Pro Tyr Ala Phe Gly Leu Val Gln Leu Leu Cys His 
305 310 315 320 

Trp Leu Ala Met Ser Ser Ala Cys Tyr Asn Pro Phe Ile Tyr Ala Trp 
325 330 335 

Leu His Asp Ser Phe Arg Glu Glu Leu Arg Lys Leu Leu Val Ala Trp 
340 345 350 

Pro Arg Lys Ile Ala Pro His Gly Gln Asn Met Thr Val Ser Val Val 
355 360 365 

Ile 



(28) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1083 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
ATGGACCCAG AAGAAACTTC AGTTTATTTG GATTATTACT ATCCTACGAG CCCAAACTCT 60 
GACATCAGGG AGACCCACTC CCATGTTCCT TACACCTCTG TCTTCCTTCC AGTCTTTTAC 120 
ACAGCTGTGT TCCTGACTGG AGTGCTGGGG AACCTTGTTC TCATGGGAGC GTTGCATTTC 180 

AAACCCGGCA GCCGAAGACT GATCGACATC TTTATCATCA ATCTGGCTGC CTCTGACTTC 240 

ATTTTTCTTG TCACATTGCC TCTCTGGGTG GATAAAGAAG CATCTCTAGG ACTGTGGAGG 300 

ACGGGCTCCT TCCTGTGCAA AGGGAGCTCC TACATGATCT CCGTCAATAT GCACTGCAGT 360 

GTCCTCCTGC TCACTTGCAT GAGTGTTGAC CGCTACCTGG CCATTGTGTG GCCAGTCGTA 420 

TCCAGGAAAT TCAGAAGGAC AGACTGTGCA TATGTAGTCT GTGCCAGCAT CTGGTTTATC 480 

TCCTGCCTGC TGGGGTTGCC TACTCTTCTG TCCAGGGAGC TCACGCTGAT TGATGATAAG 540 

CCATACTGTG CAGAGAAAAA GGCAACTCCA ATTAAACTCA TATGGTCCCT GGTGGCCTTA 600 

ATTTTCACCT TTTTTGTCCC TTTGTTGAGC ATTGTGACCT GCTACTGTTG CATTGCAAGG 660 

AAGCTGTGTG CCCATTACCA GCAATCAGGA AAGCACAACA AAAAGCTGAA GAAATCTATA 720 

AAGATCATCT TTATTGTCGT GGCAGCCTTT CTTGTCTCCT GGCTGCCCTT CAATACTTTC 780 

AAGTTCCTGG CCATTGTCTC TGGGTTGCGG CAAGAACACT ATTTACCCTC AGCTATTCTT 840 

CAGCTTGGTA TGGAGGTGAG TGGACCCTTG GCATTTGCCA ACAGCTGTGT CAACCCTTTC 900 

ATTTACTATA TCTTCGACAG CTACATCCGC CGGGCCATTG TCCACTGCTT GTGCCCTTGC 960 

CTGAAAAACT ATGACTTTGG GAGTAGCACT GAGACATCAG ATAGTCACCT CACTAAGGCT 1020 

CTCTCCACCT TCATTCATGC AGAAGATTTT GCCAGGAGGA GGAAGAGGTC TGTGTCACTC 1080 

TAA 1083 
(29) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Met Asp Pro Glu Glu Thr Ser Val Tyr Leu Asp Tyr Tyr Tyr Ala Thr 
1 5 10 15 

Ser Pro Asn Ser Asp Ile Arg Glu Thr His Ser His Val Pro Tyr Thr 
20 25 30 

Ser Val Phe Leu Pro Val Phe Tyr Thr Ala Val Phe Leu Thr Gly Val 
35 40 45 

Leu Gly Asn Leu Val Leu Met Gly Ala Leu His Phe Lys Pro Gly Ser 
50 55 60 

Arg Arg Leu Ile Asp Ile Phe Ile Ile Asn Leu Ala Ala Ser Asp Phe 
65 70 75 80 

Ile Phe Leu Val Thr Leu Pro Leu Trp Val Asp Lys Glu Ala Ser Leu 
85 90 95 

Gly Leu Trp Arg Thr Gly Ser Phe Leu Cys Lys Gly Ser Ser Tyr Met 
100 105 110 

Ile Ser Val Asn Met His Cys Ser Val Leu Leu Leu Thr Cys Met Ser 
115 120 125 

Val Asp Arg Tyr Leu Ala Ile Val Trp Pro Val Val Ser Arg Lys Phe 
130 135 140 

Arg Arg Thr Asp Cys Ala Tyr Val Val Cys Ala Ser Ile Trp Phe Ile 
145 150 155 160 

Ser Cys Leu Leu Gly Leu Pro Thr Leu Leu Ser Arg Glu Leu Thr Leu 
165 170 175 

Ile Asp Asp Lys Pro Tyr Cys Ala Glu Lys Lys Ala Thr Pro Ile Lys 
180 185 190 

Leu Ile Trp Ser Leu Val Ala Leu Ile Phe Thr Phe Phe Val Pro Leu 
195 200 205 

Leu Ser Ile Val Thr Cys Tyr Cys Cys Ile Ala Arg Lys Leu Cys Ala 
210 215 220 

His Tyr Gln Gln Ser Gly Lys His Asn Lys Lys Leu Lys Lys Ser Ile 
225 230 235 240 

Lys Ile Ile Phe Ile Val Val Ala Ala Phe Leu Val Ser Trp Leu Pro 
245 250 255 

Phe Asn Thr Phe Lys Phe Leu Ala Ile Val Ser Gly Leu Arg Gln Glu 
260 265 270 

His Tyr Leu Pro Ser Ala Ile Leu Gln Leu Gly Met Glu Val Ser Gly 
275 280 285 

Pro Leu Ala Phe Ala Asn Ser Cys Val Asn Pro Phe Ile Tyr Tyr Ile 
290 295 300 

Phe Asp Ser Tyr Ile Arg Arg Ala Ile Val His Cys Leu Cys Pro Cys 
305 310 315 320 

Leu Lys Asn Tyr Asp Phe Gly Ser Ser Thr Glu Thr Ser Asp Ser His 
325 330 335 

Leu Thr Lys Ala Leu Ser Thr Phe Ile His Ala Glu Asp Phe Ala Arg 
340 345 350 

Arg Arg Lys Arg Ser Val Ser Leu 
355 360 

(30) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CTAGAATTCT GACTCCAGCC AAAGCATGAA T 31 

(31) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GCTGGATCCT AAACAGTCTG CGCTCGGCCT 30 

(32) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1020 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATGAATGGCC TTGAAGTGGC TCCCCCAGGT CTGATCACCA ACTTCTCCCT GGCCACGGCA 60 

GAGCAATGTG GCCAGGAGAC GCCACTGGAG AACATGCTGT TCGCCTCCTT CTACCTTCTG 120 

GATTTTATCC TGGCTTTAGT TGGCAATACC CTGGCTCTGT GGCTTTTCAT CCGAGACCAC 180 

AAGTCCGGGA CCCCGGCCAA CGTGTTCCTG ATGCATCTGG CCGTGGCCGA CTTGTCGTGC 240 

GTGCTGGTCC TGCCCACCCG CCTGGTCTAC CACTTCTCTG GGAACCACTG GCCATTTGGG 300 



GAAATCGCAT GCCGTCTCAC CGGCTTCCTC TTCTACCTCA ACATGTACGC CAGCATCTAC 360 
TTCCTCACCT GCATCAGCGC CGACCGTTTC CTGGCCATTG TCCACCCGGT CAAGTCCCTC 420 
AAGCTCCGCA GGCCCCTCTA CGCACACCTG GCCTGTGCCT TCCTGTGGGT GGOXBGTGGCT 480 
GTGGCCATGG CCCCGCIX3CT GGTGAGCCCA CAGACCGTGC AGACCAACCA CACGGTGGTC 540 
TGCCTGCAGC TGTACCGGGA GAAGGCCTCC CACCATGCCC O^TGTCCCT GGCAGTGGCC 600 
TTCACCTTCC CGTTCATCAC CACGGTCACC TGCTACCTCC tXSATCATCCG CAGCCTGCGG 660 
CAGGGCCTGC GTGTGGAGAA GCGCCTCAAG ACCAAGGCAG TCCGCATGAT CGCCATAGTG 720 
CTGGCCATCT TCCTGGTCTG CTTCGTGCCC TACCACGTCA ACCGCTCCGT CTACGTGCTG 780 
GACTACCGCA GCCAI^GC CTCCTGCGCC ACCCAGCGCA TCCTGGCCCT GGCAAACCGC 840 
ATCACCTCCT GCCTCACCAG CCTCAACGGG GCACTCGACC CCATCATGTA TTTCTTCGTG 900 
GCTGAGAAGT TCCGCCACGC CCTGTGCAAC TTGCTCTGTG GCAAAAGGCT CAAGGGCCCG 960 
CCCCCCAGCT TCGAAGGGAA AACCAACGAG AGCTCGCTGA GTGCCAAGTC AGAGCTGTGA 1020 
(33) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Met Asn Gly Leu Glu Val Ala Pro Pro Gly Leu Ile Thr Asn Phe Ser 
1 5 10 15 

Leu Ala Thr Ala Glu Gln Cys Gly Gln Glu Thr Pro Leu Glu Asn Met 
20 25 30 

Leu Phe Ala Ser Phe Tyr Leu Leu Asp Phe Ile Leu Ala Leu Val Gly 
35 40 45 

Asn Thr Leu Ala Leu Trp Leu Phe Ile Arg Asp His Lys Ser Gly Thr 
50 55 60 

Pro Ala Asn Val Phe Leu Met His Leu Ala Val Ala Asp Leu Ser Cys 
65 70 75 80 

Val Leu Val Leu Pro Thr Arg Leu Val Tyr His Phe Ser Gly Asn His 
85 90 95 

Trp Pro Phe Gly Glu Ile Ala Cys Arg Leu Thr Gly Phe Leu Phe Tyr 
100 105 110 

Leu Asn Met Tyr Ala Ser Ile Tyr Phe Leu Thr Cys Ile Ser Ala Asp 
115 120 125 

Arg Phe Leu Ala Ile Val His Pro Val Lys Ser Leu Lys Leu Arg Arg 
130 135 140 

Pro Leu Tyr Ala His Leu Ala Cys Ala Phe Leu Trp Val Val Val Ala 
145 150 155 160 

Val Ala Met Ala Pro Leu Leu Val Ser Pro Gln Thr Val Gln Thr Asn 
165 170 175 

His Thr Val Val Cys Leu Gln Leu Tyr Arg Glu Lys Ala Ser His His 
180 185 190 

Ala Leu Val Ser Leu Ala Val Ala Phe Thr Phe Pro Phe Ile Thr Thr 
195 200 205 

Val Thr Cys Tyr Leu Leu Ile Ile Arg Ser Leu Arg Gln Gly Leu Arg 
210 215 220 

Val Glu Lys Arg Leu Lys Thr Lys Ala Val Arg Met Ile Ala Ile Val 
225 230 235 240 

Leu Ala Ile Phe Leu Val Cys Phe Val Pro Tyr His Val Asn Arg Ser 
245 250 255 

Val Tyr Val Leu His Tyr Arg Ser His Gly Ala Ser Cys Ala Thr Gln 
260 265 270 

Arg Ile Leu Ala Leu Ala Asn Arg Ile Thr Ser Cys Leu Thr Ser Leu 
275 280 285 

Asn Gly Ala Leu Asp Pro Ile Met Tyr Phe Phe Val Ala Glu Lys Phe 
290 295 300 

Arg His Ala Leu Cys Asn Leu Leu Cys Gly Lys Arg Leu Lys Gly Pro 
305 310 315 320 

Pro Pro Ser Phe Glu Gly Lys Thr Asn Glu Ser Ser Leu Ser Ala Lys 
325 330 335 

Ser Glu Leu 

(34) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ATAAGATGAT CACCCTGAAC AATCAAGAT 
(35) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
TCCGAATTCA TAACATTTCA CTGTTTATAT TGC 
(36) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATGATCACCC TGAACAATCA AGATCAACCT GTCACTI^A ACAGCTCACA TCCAGATGAA 60 

TACAAAATTG CAGCCCTTGT CTTCTATAGC TGTATCTTCA TAArPGGATT ATTTGTTAAC 120 

ATCACTGCAT TATGGGTTTT CAGTTGTACC ACCAAGAAGA GAACCACGGT AACCATCTAT 180 

ATGATGAATG TGGCATTAGT GGACTTGATA TTTATAATGA CrPTACCCIT TCGAATGT^ 240 

TATTATGCAA AAGATGCATG GCCATTTGGA GAGTACTTCT GCCAGATTAT TGGAGCTCTC 300 

ACAGTGTTTT ACCCAAGCAT TGCTTTATGG CTTCTI^CCT rrATTAGTGC 360 

ATGGCCATTG TACAGCCGAA GTACGCCAAA GAACTTAAAA ACACGTGCAA AGCCGTGCTG 420 

GCGTGTGl^ GAGTCTGGAT AATGACCCTG ACCACGACCA CCCCTCTGCT ACTGCTCTAT 480 

AAAGACCCAG ATAAAGACTC O.CTCCCGCC ACCTGCCTCA AGATTTCTGA CATCATCTAT 540 

CTAAAAGCTG TGAACGTGCT GAACCTCACT CGACTX3ACAT TOttCTT OATrCCTTTG 600 
"CATCAO^ „oGGT.K:TA CTTGGTCAT^ ArrCAT^^^ 

AAGC^C CCAAAGTCAA GGAGAAGTCC ATAAGGATCA TOVTCACGCT GCTGGTGCAG 720 
GTGCTCGTCT GCTTTATGCC CTTCCACATC TGTTTCGCTT TCCTGATGCT GGGAACGGGG 780 

GAGAACAGTT ACAATCCCTG GGGAGCCTTT ACCACCTTCC TCATGAACCT CAGCACGTGT 840 

CTGGATGTGA TTCTCTACTA CATCGTTTCA AAACAATTTC AGGCTCGAGT CATTAGTGTC 900 

ATGCTATACC GTAATTACCT TCGAAGCCTG CGCAGAAAAA GTTTCCGATC TGGTAGTCTA 960 

AGGTCACTAA GCAATATAAA CAGTGAAATG TTATGA 996 
(37) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Met Ile Thr Leu Asn Asn Gln Asp Gln Pro Val Thr Phe Asn Ser Ser 
1 5 10 15 

His Pro Asp Glu Tyr Lys Ile Ala Ala Leu Val Phe Tyr Ser Cys Ile 
20 25 30 

Phe Ile Ile Gly Leu Phe Val Asn Ile Thr Ala Leu Trp Val Phe Ser 
35 40 45 

Cys Thr Thr Lys Lys Arg Thr Thr Val Thr Ile Tyr Met Met Asn Val 
50 55 60 

Ala Leu Val Asp Leu Ile Phe Ile Met Thr Leu Pro Phe Arg Met Phe 
65 70 75 80 

Tyr Tyr Ala Lys Asp Ala Trp Pro Phe Gly Glu Tyr Phe Cys Gln Ile 
85 90 95 

Ile Gly Ala Leu Thr Val Phe Tyr Pro Ser Ile Ala Leu Trp Leu Leu 
100 105 110 

Ala Phe Ile Ser Ala Asp Arg Tyr Met Ala Ile Val Gln Pro Lys Tyr 
115 120 125 

Ala Lys Glu Leu Lys Asn Thr Cys Lys Ala Val Leu Ala Cys Val Gly 
130 135 140 

Val Trp Ile Met Thr Leu Thr Thr Thr Thr Pro Leu Leu Leu Leu Tyr 
145 150 155 160 

Lys Asp Pro Asp Lys Asp Ser Thr Pro Ala Thr Cys Leu Lys Ile Ser 
165 170 175 

Asp Ile Ile Tyr Leu Lys Ala Val Asn Val Leu Asn Leu Thr Arg Leu 
180 185 190 

Thr Phe Phe Phe Leu Ile Pro Leu Phe Ile Met Ile Gly Cys Tyr Leu 
195 200 205 

Val Ile Ile His Asn Leu Leu His Gly Arg Thr Ser Lys Leu Lys Pro 
210 215 220 

Lys Val Lys Glu Lys Ser Ile Arg Ile Ile Ile Thr Leu Leu Val Gln 
225 230 235 240 

Val Leu Val Cys Phe Met Pro Phe His Ile Cys Phe Ala Phe Leu Met 
245 250 255 

Leu Gly Thr Gly Glu Asn Ser Tyr Asn Pro Trp Gly Ala Phe Thr Thr 
260 265 270 

Phe Leu Met Asn Leu Ser Thr Cys Leu Asp Val Ile Leu Tyr Tyr Ile 
275 280 285 

Val Ser Lys Gln Phe Gln Ala Arg Val Ile Ser Val Met Leu Tyr Arg 
290 295 300 

Asn Tyr Leu Arg Ser Leu Arg Arg Lys Ser Phe Arg Ser Gly Ser Leu 
305 310 315 320 

Arg Ser Leu Ser Asn Ile Asn Ser Glu Met Leu 
325 330 

(38) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
CCAAGCTTCC AGGCCTGGGG TGTGCTGG 28 
(39) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATGGATCCTG ACCTTCGGCC CCTGGCAGA 29 
(40) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1077 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

ATGCCCTCTG TGTCTCCAGC GGGGCCCTCG GCCGGGGCAG TCCCCAATGC CACCGCAGTG 60 

ACAACAGTGC GGACCAATGC CAGCGGGCTG GAGGTGCCCC TGTTCCACCT GTTTGCCCGG 120 

CTGGACGAGG AGCTGCATGG CACCTTCCCA GGCCTGTGCG TGGCGCTGAT GGCGGTGCAC 180 

GGAGCCATCT TCCTGGCAGG GCTGGTGCTC AACGGGCTGG CGCTGTACGT CTTCTGCTGC 240 

CGCACCCGGG CCAAGACACC CTCAGTCATC TACACCATCA ACCTGGTGGT GACCGATCTA 300 

CTGGTAGGGC TGTCCCTGCC CACGCGCTTC GCTGTGTACT ACGGCGCCAG GGGCTGCCTG 360 

CGCTGTGCCT TCCCGCACGT CCTCGGTTAC TTCCTCAACA TGCACTGCTC CATCCTCTTC 420 

CTCACCTGCA TCTGCGTGGA CCGCTACCTG GCCATCGTGC GGCCCGAAGG CTCCCGCCGC 480 

TGCCGCCAGC CTGCCTGTGC CAGGGCCGTG TGCGCCTTCG TGTGGCTGGC CGCCGGTGCC 540 

GTCACCCTGT CGGTGCTGGG CGTGACAGGC AGCCGGCCCT GCTGCCGTGT CTTTGCGCTG 600 

ACTGTCCTGG AGTTCCTGCT GCCCCTGCTG GTCATCAGCG TGTTTACCGG CCGCATCATG 660 

TGTGCACTGT CGCGGCCGGG TCTGCTCCAC CAGGGTCGCC AGCGCCGCGT GCGGGCCATG 720 

CAGCTCCTGC TCACGGTGCT CATCATCTTT CTCGTCTGCT TCACGCCCTT CCACGCCCGC 780 

CAAGTGGCCG TGGCGCTGTG GCCCGACATG CCACACCACA CGAGCCTCGT GGTCTACCAC 840 

GTGGCCGTGA CCCTCAGCAG CCTCAACAGC TGCATGGACC CCATCGTCTA CTGCTTCGTC 900 

ACCAGTGGCT TCCAGGCCAC CGTCCGAGGC CTCTTCGGCC AGCACGGAGA GCGTGAGCCC 960 

AGCAGCGGTG ACGTGGTCAG CATGCACAGG AGCTCCAAGG GCTCAGGCCG TCATCACATC 1020 

CTCAGTGCCG GCCCTCACGC CCTCACCCAG GCCCTGGCTA ATGGGCCCGA GGCTTAG 1077 
(41) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 358 amino acids 



WO 00/22129 PCT/US99/23938 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
Met Pro Ser Val Ser Pro Ala Gly Pro Ser Ala Gly Ala Val Pro Asn 
1 5 10 15 

Ala Thr Ala Val Thr Thr Val Arg Thr Asn Ala Ser Gly Leu Glu Val 
20 25 30 

Pro Leu Phe His Leu Phe Ala Arg Leu Asp Glu Glu Leu His Gly Thr 
35 40 45 

Phe Pro Gly Leu Cys Val Ala Leu Met Ala Val His Gly Ala Ile Phe 
50 55 60 

Leu Ala Gly Leu Val Leu Asn Gly Leu Ala Leu Tyr Val Phe Cys Cys 
65 70 75 80 

Arg Thr Arg Ala Lys Thr Pro Ser Val Ile Tyr Thr Ile Asn Leu Val 
85 90 95 

Val Thr Asp Leu Leu Val Gly Leu Ser Leu Pro Thr Arg Phe Ala Val 
100 105 110 

Tyr Tyr Gly Ala Arg Gly Cys Leu Arg Cys Ala Phe Pro His Val Leu 
115 120 125 

Gly Tyr Phe Leu Asn Met His Cys Ser Ile Leu Phe Leu Thr Cys Ile 
130 135 140 

Cys Val Asp Arg Tyr Leu Ala Ile Val Arg Pro Glu Ala Pro Ala Ala 
145 150 155 160 

Cys Arg Gln Pro Ala Cys Ala Arg Ala Val Cys Ala Phe Val Trp Leu 
165 170 175 

Ala Ala Gly Ala Val Thr Leu Ser Val Leu Gly Val Thr Gly Ser Arg 
180 185 190 

Pro Cys Cys Arg Val Phe Ala Leu Thr Val Leu Glu Phe Leu Leu Pro 
195 200 205 

Leu Leu Val Ile Ser Val Phe Thr Gly Arg Ile Met Cys Ala Leu Ser 
210 215 220 

Arg Pro Gly Leu Leu His Gln Gly Arg Gln Arg Arg Val Arg Ala Met 
225 230 235 240 

Gln Leu Leu Leu Thr Val Leu Ile Ile Phe Leu Val Cys Phe Thr Pro 
245 250 255 

Phe His Ala Arg Gln Val Ala Val Ala Leu Trp Pro Asp Met Pro His 
260 265 270 

His Thr Ser Leu Val Val Tyr His Val Ala Val Thr Leu Ser Ser Leu 
275 280 285 

Asn Ser Cys Met Asp Pro Ile Val Tyr Cys Phe Val Thr Ser Gly Phe 
290 295 300 

Gln Ala Thr Val Arg Gly Leu Phe Gly Gln His Gly Glu Arg Glu Pro 
305 310 315 320 

Ser Ser Gly Asp Val Val Ser Met His Arg Ser Ser Lys Gly Ser Gly 
325 330 335 

Arg His His Ile Leu Ser Ala Gly Pro His Ala Leu Thr Gln Ala Leu 
340 345 350 

Ala Asn Gly Pro Glu Ala 
355 

(42) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GAGAATTCAC TCCTGAGCTC AAGATGAACT 30 

(43) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
CGGGATCCCC GTAACTGAGC CACTTCAGAT 30 

(44) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
ATGAACTCCA CCTTGGATGG TAATCAGAGC AGCCACCCTT TTTGCCTCTT GGCATTTGGC 60 
TAOTTGGAAA CTGTCAATTT TTGCCTTTTG GAAGTATTGA TTATTGTCTT T^^ 120 
TTGATTATTT CTGGCAACAT CATTGTGATT TTTGTATTTC ACTCTOCACC rrTGTTCAAC 180 
CATCACACTA CAAGTTATTT TATCCAGACT ATCGCATATG CTGACCTTTT TGTTGGGGTG 240 
AGCTGCGTGG TCCCTTCTTT ATCACTCCTC CATCACCCCC TTCCAGTAGA GGAGTCCTTG 300 
ACTTGCCAGA TATTTGGTTT TGTAGTATCA GTTCTGAAGA GCGTCTCCAT GGCTTCTCTG 360 

GCCTGTATCA GCATTGATAG ATACATTGCC ATTACTAAAC CTTTAACCTA TAATACTCTG 420 

GTTACACCCT GGAGACTACG CCTGTGTATT TTCCTGATTT GGCTATACTC GACCCTGGTC 480 

TTCCTGCCTT CCTTTTTCCA CTCGGGCAAA CCTGGATATC ATGGAGATGT GTTTCAGTGG 540 

TGTGCGGAGT CCTGGCACAC CGACTCCTAC TTCACCCTCT TCATCGTGAT GATGTTATAT 600 

GCCCCAGCAG CCCTTATTGT CTGCTTCACC TATTTCAACA TCTTCCGCAT CTGCCAACAG 660 

CACACAAAGG ATATCAGCGA AAGGCAAGCC CGCTTCAGCA GCCAGAGTGG GGAGACTGGG 720 

GAAGTGCAGG CCTGTCCTGA TAAGCGCTAT GCCATGGTCC TGTTTCGAAT CACTAGTGTA 780 

TTTTACATCC TCTGGTTGCC ATATATCATC TACTTCTTGT TGGAAAGCTC CACTGGCCAC 840 

AGCAACCGCT TCGCATCCTT CTTGACCACC TGGCTTGCTA TTAGTAACAG TTTCTGCAAC 900 

TGTGTAATTT ATAGTCTCTC CAACAGTGTA TTCCAAAGAG GACTAAAGCG CCTCTCAGGG 960 

GCTATGTGTA CTTCTTGTGC AAGTCAGACT ACAGCCAACG ACCCTTACAC AGTTAGAAGC 1020 
AAAGGCCCTC TTAATGGATG TCATATCTGA 1050 

(45) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 349 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
Met Asn Ser Thr Leu Asp Gly Asn Gln Ser Ser His Pro Phe Cys Leu 
1 5 10 15 

Leu Ala Phe Gly Tyr Leu Glu Thr Val Asn Phe Cys Leu Leu Glu Val 
20 25 30 

Leu Ile Ile Val Phe Leu Thr Val Leu Ile Ile Ser Gly Asn Ile Ile 
35 40 45 

Val Ile Phe Val Phe His Cys Ala Pro Leu Leu Asn His His Thr Thr 
50 55 60 

Ser Tyr Phe Ile Gln Thr Met Ala Tyr Ala Asp Leu Phe Val Gly Val 
65 70 75 80 

Ser Cys Val Val Pro Ser Leu Ser Leu Leu His His Pro Leu Pro Val 
85 90 95 

Glu Glu Ser Leu Thr Cys Gln Ile Phe Gly Phe Val Val Ser Val Leu 
100 105 110 

Lys Ser Val Ser Met Ala Ser Leu Ala Cys Ile Ser Ile Asp Arg Tyr 
115 120 125 

Ile Ala Ile Thr Lys Pro Leu Thr Tyr Asn Thr Leu Val Thr Pro Trp 
130 135 140 

Arg Leu Arg Leu Cys Ile Phe Leu Ile Trp Leu Tyr Ser Thr Leu Val 
145 150 155 160 

Phe Leu Pro Ser Phe Phe His Trp Gly Lys Pro Gly Tyr His Gly Asp 
165 170 175 

Val Phe Gln Trp Cys Ala Glu Ser Trp His Thr Asp Ser Tyr Phe Thr 
180 185 190 

Leu Phe Ile Val Met Met Leu Tyr Ala Pro Ala Ala Leu Ile Val Cys 
195 200 205 

Phe Thr Tyr Phe Asn Ile Phe Arg Ile Cys Gln Gln His Thr Lys Asp 
210 215 220 

Ile Ser Glu Arg Gln Ala Arg Phe Ser Ser Gln Ser Gly Glu Thr Gly 
225 230 235 240 

Glu Val Gln Ala Cys Pro Asp Lys Arg Tyr Ala Met Val Leu Phe Arg 
245 250 255 

Ile Thr Ser Val Phe Tyr Ile Leu Trp Leu Pro Tyr Ile Ile Tyr Phe 
260 265 270 

Leu Leu Glu Ser Ser Thr Gly His Ser Asn Arg Phe Ala Ser Phe Leu 
275 280 285 

Thr Thr Trp Leu Ala Ile Ser Asn Ser Phe Cys Asn Cys Val Ile Tyr 
290 295 300 

Ser Leu Ser Asn Ser Val Phe Gln Arg Gly Leu Lys Arg Leu Ser Gly 
305 310 315 320 

Ala Met Cys Thr Ser Cys Ala Ser Gln Thr Thr Ala Asn Asp Pro Tyr 
325 330 335 

Thr Val Arg Ser Lys Gly Pro Leu Asn Gly Cys His Ile 
340 345 

(46) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
TCCCCCGGGA AAAAAACCAA CTGCTCCAAA 30 

(47) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
TAGGATCCAT TTGAATGTGG ATTTGGTGAA A 31 
(48) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1302 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

ATGTGTTTTT CTCCCATTCT GGAAATCAAC ATCCAGTCTG AATCTAACAT TACAGTCCGA 60 

GATGACATTG ATGACATCAA CACCAATATG TACCAACCAC TATCATATCC GTTAAGCTTT 120 

CAAGTGTCTC TCACCGGATT TCTTATGTTA GAAATTGTGT TGGGACTTGG CAGCAACCTC 180 
ACTGTATTGG TACTTTACTG CATGAAATCC AACTTAATCA ACTCTGTCAG TAACATTATT 240 

ACAATGAATC TTCATGTACT TGATGTAATA ATTTGTGTGG GATGTATTCC TCTAACTATA 300 

GTTATCCTTC TGCTTTCACT GGAGAGTAAC ACTGCTCTCA TTTGCTGTTT CCATGAGGCT 360 

TGTGTATCTT TTGCAAGTGT CTCAACAGCA ATCAACGTTT TTGCTATCAC TTTGGACAGA 420 

TATGACATCT CTGTAAAACC TGCAAACCGA ATTCTGACAA TGGGCAGAGC TGTAATGTTA 480 

ATGATATCCA TTTGGATTTT TTCTTTTTTC TCTTTCCTGA TTCCTTTTAT TGAGGTAAAT 540 

TTTTTCAGTC TTCAAAGTGG AAATACCTGG GAAAACAAGA CACTTTTATG TGTCAGTACA 600 

AATGAATACT ACACTGAACT GGGAATGTAT TATCACCTGT TAGTACAGAT CCCAATATTC 660 

TTTTTCACTG TTGTAGTAAT GTTAATCACA TACACCAAAA TACTTCAGGC TCTTAATATT 720 

CGAATAGGCA CAAGATTTTC AACAGGGCAG AAGAAGAAAG CAAGAAAGAA AAAGACAATT 780 

TCTCTAACCA CACAACATGA GGCTACAGAC ATGTCACAAA GCAGTGGTGG GAGAAATGTA 840 

GTCTTTGGTG TAAGAACTTC AGTTTCTGTA ATAATTGCCC TCCGGCGAGC TGTGAAACGA 900 

CACCGTGAAC GACGAGAAAG ACAAAAGAGA GTCTTCAGGA TGTCTTTATT GATTATTTCT 960 

ACATTTCTTC TCTGCTGGAC ACCAATTTCT GTTTTAAATA CCACCATTTT ATGTTTAGGC 1020 

CCAAGTGACC TTTTAGTAAA ATTAAGATTG TGTTTTTTAG TCATGGCTTA TGGAACAACT 1080 

ATATTTCACC CTCTATTATA TGCATTCACT AGACAAAAAT TTCAAAAGGT CTTGAAAAGT 1140 

AAAATGAAAA AGCGAGTTGT TTCTATAGTA GAAGCTGATC CCCTGCCTAA TAATGCTGTA 1200 

ATACACAACT CTTGGATAGA TCCCAAAAGA AACAAAAAAA TTACCTTTGA AGATAGTGAA 1260 

ATAAGAGAAA AACGTTTAGT GCCTCAGGTT GTCACAGACT AG 1302 
(49) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Met Cys Phe Ser Pro Ile Leu Glu Ile Asn Met Gln Ser Glu Ser Asn 
1 5 10 15 

Ile Thr Val Arg Asp Asp Ile Asp Asp Ile Asn Thr Asn Met Tyr Gln 
20 25 30 

Pro Leu Ser Tyr Pro Leu Ser Phe Gln Val Ser Leu Thr Gly Phe Leu 
35 40 45 

Met Leu Glu Ile Val Leu Gly Leu Gly Ser Asn Leu Thr Val Leu Val 
50 55 60 

Leu Tyr Cys Met Lys Ser Asn Leu Ile Asn Ser Val Ser Asn Ile Ile 
65 70 75 80 

Thr Met Asn Leu His Val Leu Asp Val Ile Ile Cys Val Gly Cys Ile 
85 90 95 

Pro Leu Thr Ile Val Ile Leu Leu Leu Ser Leu Glu Ser Asn Thr Ala 
100 105 110 

Leu Ile Cys Cys Phe His Glu Ala Cys Val Ser Phe Ala Ser Val Ser 
115 120 125 

Thr Ala Ile Asn Val Phe Ala Ile Thr Leu Asp Arg Tyr Asp Ile Ser 
130 135 140 

Val Lys Pro Ala Asn Arg Ile Leu Thr Met Gly Arg Ala Val Met Leu 
145 150 155 160 

Met Ile Ser Ile Trp Ile Phe Ser Phe Phe Ser Phe Leu Ile Pro Phe 
165 170 175 

Ile Glu Val Asn Phe Phe Ser Leu Gln Ser Gly Asn Thr Trp Glu Asn 
180 185 190 

Lys Thr Leu Leu Cys Val Ser Thr Asn Glu Tyr Tyr Thr Glu Leu Gly 
195 200 205 

Met Tyr Tyr His Leu Leu Val Gln Ile Pro Ile Phe Phe Phe Thr Val 
210 215 220 

Val Val Met Leu Ile Thr Tyr Thr Lys Ile Leu Gln Ala Leu Asn Ile 
225 230 235 240 

Arg Ile Gly Thr Arg Phe Ser Thr Gly Gln Lys Lys Lys Ala Arg Lys 
245 250 255 

Lys Lys Thr Ile Ser Leu Thr Thr Gln His Glu Ala Thr Asp Met Ser 
260 265 270 

Gln Ser Ser Gly Gly Arg Asn Val Val Phe Gly Val Arg Thr Ser Val 
275 280 285 

Ser Val Ile Ile Ala Leu Arg Arg Ala Val Lys Arg His Arg Glu Arg 
290 295 300 

Arg Glu Arg Gln Lys Arg Val Phe Arg Met Ser Leu Leu Ile Ile Ser 
305 310 315 320 

Thr Phe Leu Leu Cys Trp Thr Pro Ile Ser Val Leu Asn Thr Thr Ile 
325 330 335 

Leu Cys Leu Gly Pro Ser Asp Leu Leu Val Lys Leu Arg Leu Cys Phe 
340 345 350 

Leu Val Met Ala Tyr Gly Thr Thr Ile Phe His Pro Leu Leu Tyr Ala 
355 360 365 

Phe Thr Arg Gln Lys Phe Gln Lys Val Leu Lys Ser Lys Met Lys Lys 
370 375 380 

Arg Val Val Ser Ile Val Glu Ala Asp Pro Leu Pro Asn Asn Ala Val 
385 390 395 400 

Ile His Asn Ser Trp Ile Asp Pro Lys Arg Asn Lys Lys Ile Thr Phe 
405 410 415 

Glu Asp Ser Glu Ile Arg Glu Lys Arg Leu Val Pro Gln Val Val Thr 
420 425 430 

Asp 



(50) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
GTGAAGCTTG CCTCTGGTGC CTGCAGGAGG 30 
(51) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GCAGAATTCC CGGTGGCGTG TTGTGGTGCC C 31 

(52) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1209 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

ATGTTGTGTC CTTCCAAGAC AGATGGCTCA GGGCACTCTC GTAGGATTCA CCAGGAAACT 60 

CATGGAGAAG GGAAAAGGGA CAAGATTAGC AACAGl^G GGAG<^GAA TGGTGC^^ 120 
GGATTCCAGA TGAACGGTGG GTCGCTGGAG GCTGAGCATG CCAGCAGGAT GTCAGTTCTC 180 
AGAGCAAAGC CCATGTCAAA CAGCCAACGC rPGCTCCTTC TGTCCCCAGG ATCACCTCCT 240 

CGCACGGGGA GCATCTCCTA CATCAACATC ATCATGCCTT CGGTGTTCGG CACCATCTCC 300 

CTCCTGGGCA TCATCGGGAA CTCCACGGTC ATCTTCGCGG TCGTGAAGAA GTCCAAGCTG 360 

CACTCGTGCA ACAACGTCCC CGACATCTTC ATCATCAACC TCTCGGTAGT AGATCTCCTC 420 

TTTCTCCTGG GCATGCCCTT CATGATCCAC CAGCTCATGG GCAATGGGGT GTGGCACTTT 480 

GGGGAGACCA TGTGCACCCT CATCACGGCC ATGGATGCCA ATAGTCAGTT CACCAGCACC 540 

TACATCCTGA CCGCCATGGC CATTGACCGC TACCTGGCCA CTGTCCACCC CATCTCTTCC 600 

ACGAAGTTCC GGAAGCCCTC TGTGGCCACC CKOTGATCT GCCTCCTGTC GGCCCTCTCC 660 

TTCATCAGCA TCACCCCTGT GTCGCTGTAT GCCAGACTCA TCCCCTTCCC AGGAGGTGCA 720 

GTGGGCTGCG GCATACGCCT GCCCAACCCA GACACTGACC TCTACTGGTT CACCCTGTAC 780 

CAGTTTTTCC TGGCCTTTGC CCTGCCTTTT GTGGTCATCA CAGCCGCATA CGTGAGGATC 840 

CTGCAGCGCA TGACGTCCTC AGTGGCCCCC GCCTCCCAGC GCAGCATCCG GCTGCGGACA 900 

AAGAGGGTGA CCCGCACAGC CATCGCCATC TGTCT.3GTCT TCTTTGTGTG CTGGGCACCC 960 

TACTATGTGC TACAGCTGAC CCAGTT^TCC ATCAGCCGCC CGACCCTCAC CtJtGTCTAC 1020 

TTATACAATG CGGCCATCAG CTTGGGCTAT GCCAACAGCT GCCTCAACCC CTTl^TOTAC 1080 

ATCGTGCTCT GTGAGACGTT CCGCAAACGC T^TCCTGT CGG^CC TGCAGCCCAG 1140 

GGGCAGCTTC GCGCl^TCAG CAACGCTCAG ACGGCTGACG AGGAGAGGAC AGAAAGCAAA 1200 
GGCACCTGA 1209 

(53) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 402 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQXJENCE DESCRIPTION: SEQ ID NO: 52: 

Met Leu Cys Pro Ser Lys Thr Asp Gly Ser Gly His Ser Gly Arg Ile
1 5 10 15

His Gln Glu Thr His Gly Glu Gly Lys Arg Asp Lys Ile Ser Asn Ser
20 25 30

Glu Gly Arg Glu Asn Gly Gly Arg Gly Phe Gln Met Asn Gly Gly Ser
35 40 45

Leu Glu Ala Glu His Ala Ser Arg Met Ser Val Leu Arg Ala Lys Pro
50 55 60

Met Ser Asn Ser Gln Arg Leu Leu Leu Leu Ser Pro Gly Ser Pro Pro
65 70 75 80

Arg Thr Gly Ser Ile Ser Tyr Ile Asn Ile Ile Met Pro Ser Val Phe
85 90 95

Gly Thr Ile Cys Leu Leu Gly Ile Ile Gly Asn Ser Thr Val Ile Phe
100 105 110

Ala Val Val Lys Lys Ser Lys Leu His Trp Cys Asn Asn Val Pro Asp
115 120 125

Ile Phe Ile Ile Asn Leu Ser Val Val Asp Leu Leu Phe Leu Leu Gly
130 135 140

Met Pro Phe Met Ile His Gln Leu Met Gly Asn Gly Val Trp His Phe
145 150 155 160

Gly Glu Thr Met Cys Thr Leu Ile Thr Ala Met Asp Ala Asn Ser Gln
165 170 175

Phe Thr Ser Thr Tyr Ile Leu Thr Ala Met Ala Ile Asp Arg Tyr Leu
180 185 190

Ala Thr Val His Pro Ile Ser Ser Thr Lys Phe Arg Lys Pro Ser Val
195 200 205

Ala Thr Leu Val Ile Cys Leu Leu Trp Ala Leu Ser Phe Ile Ser Ile
210 215 220

Thr Pro Val Trp Leu Tyr Ala Arg Leu Ile Pro Phe Pro Gly Gly Ala
225 230 235 240

Val Gly Cys Gly Ile Arg Leu Pro Asn Pro Asp Thr Asp Leu Tyr Trp
245 250 255

Phe Thr Leu Tyr Gln Phe Phe Leu Ala Phe Ala Leu Pro Phe Val Val
260 265 270

Ile Thr Ala Ala Tyr Val Arg Ile Leu Gln Arg Met Thr Ser Ser Val
275 280 285

Ala Pro Ala Ser Gln Arg Ser Ile Arg Leu Arg Thr Lys Arg Val Thr
290 295 300

Arg Thr Ala Ile Ala Ile Cys Leu Val Phe Phe Val Cys Trp Ala Pro
305 310 315 320

Tyr Tyr Val Leu Gln Leu Thr Gln Leu Ser Ile Ser Arg Pro Thr Leu
325 330 335

Thr Phe Val Tyr Leu Tyr Asn Ala Ala Ile Ser Leu Gly Tyr Ala Asn
340 345 350

Ser Cys Leu Asn Pro Phe Val Tyr Ile Val Leu Cys Glu Thr Phe Arg
355 360 365

Lys Arg Leu Val Leu Ser Val Lys Pro Ala Ala Gln Gly Gln Leu Arg
370 375 380

Ala Val Ser Asn Ala Gln Thr Ala Asp Glu Glu Arg Thr Glu Ser Lys
385 390 395 400

Gly Thr

(54) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

GGCGGATCCA TGGATGTGAC TTCCCAA 27
(55) INFORMATION FOR SEQ ID NO:54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

GGCGGATCCC TACACGGCAC TGCTGAA 27
(56) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAC 60

GCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120 

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180 

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240 

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300 

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420 

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480 

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540 

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600 

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720 

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GCTCCGCATG 780 

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840 

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900 

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCGCCGCCT TCTCCAACAG CTGCCTAAAC 960

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020 

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080 

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTGA 1128 
(57) INFORMATION FOR SEQ ID NO: 56: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 375 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala His Ala Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Leu Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Ala Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(58) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

AAGGAATTCA CGGCCGGGTG ATGCCATTCC C 31 

(59) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
GGTGGATCCA TAAACACGGG CGTTGAGGAC 30

(60) INFORMATION FOR SEQ ID NO:59: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 960 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ATGCCATTCC CAAACTGCTC AGCCCCCAGC ACTGTCGTGG CCACAGCTGT GGGTGTCTTG 60
CTGGGGCTGG AGTGTGGGCT GGGTCTGCTG GGCAACGCGG TGGCGCTGTG GACCTTCCTG 120
TTCCGGGTCA GGGTGTGGAA GCCGTACGCT GTCTACCTGC TCAACCTGGC CCTGGCTGAC 180
CTGCTGITGG CTGCGTGCCT GCCTTTCCTG GCCGCCTTCT ACCTOAGCCT CCAGGCTTGG 240
CATCTGGGCC GTGTGGGCTG CTGGGCCCTG CGCTTCCTGC TGGACCTCAG CCGCAGCGTG 300
GGGATGGCCT TCCTGGCCGC CGTGGCTTTG GACCGGTACC TCCGTGTGGT CCACCCTCGG 360
CTTAAGGTCA ACCTGCTGTC TCCTCAGGCG GCCCTGGGGG TCTCGGGCCT CGTCTGGCTC 420
CTGATGGTCG CCCTCACCTG CCCGGGCTTG CTCATCTCTG AGGCCGCCCA GAACTCCACC 480
AGGTGCCACA GTTTCTACTC CAGGGCAGAC GGCTCCTTCA GCATCATCTG GCAGGAAGCA 540
CTCTCCTGCC TTCAGTTTGT CCTCCCCTTT GGCCTCATCG TGTTCTGCAA TGCAGGCATC 600
ATCAGGGCTC TCCAGAAAAG ACTCCGGGAG CCTGAGAAAC AGCCCAAGCT TCAGCGGGCC 660
CAGGCACTGG TCACCTTGGT GGTGGTGCTG TTTGCTCTGT GCTTTCTGCC CTGCTTCCTG 720
GCCAGAGTCC TGATGCACAT CTTCCAGAAT CTGGGGAGCT GCAGGGCCCT TTGTGCAGTG 780
GCTCATACCT CGGATGTCAC GGGCAGCCTC ACCTACCTGC ACAGTGTCGT CAACCCCGTG 840
GTATACTGCT TCTCCAGCCC CACCTTCAGG AGCTCCTATC GGAGGGTCTT CCACACCCTC 900
CGAGGCAAAG GGCAGGCAGC AGAGCCCCCA GATTTCAACC CCAGAGACTC CTATTCCTGA 960
(61) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
Met Pro Phe Pro Asn Cys Ser Ala Pro Ser Thr Val Val Ala Thr Ala
1 5 10 15

Val Gly Val Leu Leu Gly Leu Glu Cys Gly Leu Gly Leu Leu Gly Asn
20 25 30

Ala Val Ala Leu Trp Thr Phe Leu Phe Arg Val Arg Val Trp Lys Pro
35 40 45

Tyr Ala Val Tyr Leu Leu Asn Leu Ala Leu Ala Asp Leu Leu Leu Ala
50 55 60

Ala Cys Leu Pro Phe Leu Ala Ala Phe Tyr Leu Ser Leu Gln Ala Trp
65 70 75 80

His Leu Gly Arg Val Gly Cys Trp Ala Leu Arg Phe Leu Leu Asp Leu
85 90 95

Ser Arg Ser Val Gly Met Ala Phe Leu Ala Ala Val Ala Leu Asp Arg
100 105 110

Tyr Leu Arg Val Val His Pro Arg Leu Lys Val Asn Leu Leu Ser Pro
115 120 125

Gln Ala Ala Leu Gly Val Ser Gly Leu Val Trp Leu Leu Met Val Ala
130 135 140

Leu Thr Cys Pro Gly Leu Leu Ile Ser Glu Ala Ala Gln Asn Ser Thr
145 150 155 160

Arg Cys His Ser Phe Tyr Ser Arg Ala Asp Gly Ser Phe Ser Ile Ile
165 170 175

Trp Gln Glu Ala Leu Ser Cys Leu Gln Phe Val Leu Pro Phe Gly Leu
180 185 190

Ile Val Phe Cys Asn Ala Gly Ile Ile Arg Ala Leu Gln Lys Arg Leu
195 200 205

Arg Glu Pro Glu Lys Gln Pro Lys Leu Gln Arg Ala Gln Ala Leu Val
210 215 220

Thr Leu Val Val Val Leu Phe Ala Leu Cys Phe Leu Pro Cys Phe Leu
225 230 235 240

Ala Arg Val Leu Met His Ile Phe Gln Asn Leu Gly Ser Cys Arg Ala
245 250 255

Leu Cys Ala Val Ala His Thr Ser Asp Val Thr Gly Ser Leu Thr Tyr
260 265 270

Leu His Ser Val Val Asn Pro Val Val Tyr Cys Phe Ser Ser Pro Thr
275 280 285

Phe Arg Ser Ser Tyr Arg Arg Val Phe His Thr Leu Arg Gly Lys Gly
290 295 300

Gln Ala Ala Glu Pro Pro Asp Phe Asn Pro Arg Asp Ser Tyr Ser
305 310 315

(62) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1143 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
ATGGAGGAAG GTGGTGATTT TGACAACTAC TATGGGGCAG ACAACCAGTC TGAGTGTGAG 60
TACACAGACT GGAAATCCTC GGGGGCCCTC ATCCCTGCCA TCTACATCTT GGTCTTCCTC 120
CTGGGCACCA CGGGAAACGG TCTGGTGCTC TGGACCGTGT TTCGGAGCAG CCGGGAGAAG 180
AGGCGCTCAG CTGATATCTT CATTGCTAGC CTGGCGGTGG CTGACCTGAC CTTCGTGGTG 240
ACGCTGCCCC TGTGGGCTAC CTACACGTAC CGGGACTATG ACTGGCCCTT TGGGACCTTC 300
TTCTGCAAGC TCAGCAGCTA CCTCATCTTC GTCAACATCT ACGCCAGCGT CTTCTGCCTC 360
ACCGGCCTCA GCTTCGACCG CTACCTGGCC ATCGTCAGGC CAGTCGCCAA TGCTCGGCTG 420
AGGCTGCGGG TCAGCGGGGC CGTGGCCACG GCAGTTCTTT GGGTGCTGGC CGCCCTCCTG 480
GCCATGCCTG TCATGGTGTT ACGCACCACC GGGGACTTGG AGAACACCAC TAAGGTGCAG 540
TGCTACATGG ACTACTCCAT GGTGGCCACT GTGAGCTCAG AGTGGGCCTG GGAGGTGGGC 600
CTTGGGGTCT CGTCCACCAC CGTGGGCTTT GTGGTGCCCT TCACCATCAT GCTGACCTGT 660
TACTTCTTGA TCGCCCAAAC CATCGCTGGC CACTTCCGCA AGGAACGCAT CGAGGGCCTG 720
CGGAAGCGGC GCCGGCTGCT CAGCATCATC GTGGTGCTGG TGGTGACCTT TGCCCTGTCC 780
TGGATGCCCT ACCACCTGGT GAAGACGCTG TACATGCTCG GCAGCCTCCT GCACTGGCCC 840
TGTGACTTTG ACCTCTTCCT CATGAACATC TTCCCCTACT GCACCTGCAT CAGCTACGTC 900
AACAGCTGCC TCAACCCCTT CCTCTATGCC TTTTTCGACC CCCGCTTCCG CCAGGCCTGC 960
ACCTCCATGC TCTGCTGTGG CCAGAGCAGG TGCGCAGGCA CCTCCCACAG CAGCAGTGGG 1020
GAGAAGTCAG CCAGCTACTC TTCGGGGCAC AGCCAGGGGC CCGGCCCCAA CATGGGCAAG 1080
GGTGGAGAAC AGATGCACGA GAAATGCATC CCCTACAGCC AGGAGACCCT TGTGGTTGAC 1140
TAG 1143
(63) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 380 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Met Glu Glu Gly Gly Asp Phe Asp Asn Tyr Tyr Gly Ala Asp Asn Gln
1 5 10 15

Ser Glu Cys Glu Tyr Thr Asp Trp Lys Ser Ser Gly Ala Leu Ile Pro
20 25 30

Ala Ile Tyr Met Leu Val Phe Leu Leu Gly Thr Thr Gly Asn Gly Leu
35 40 45

Val Leu Trp Thr Val Phe Arg Ser Ser Arg Glu Lys Arg Arg Ser Ala
50 55 60

Asp Ile Phe Ile Ala Ser Leu Ala Val Ala Asp Leu Thr Phe Val Val
65 70 75 80

Thr Leu Pro Leu Trp Ala Thr Tyr Thr Tyr Arg Asp Tyr Asp Trp Pro
85 90 95

Phe Gly Thr Phe Phe Cys Lys Leu Ser Ser Tyr Leu Ile Phe Val Asn
100 105 110

Met Tyr Ala Ser Val Phe Cys Leu Thr Gly Leu Ser Phe Asp Arg Tyr
115 120 125

Leu Ala Ile Val Arg Pro Val Ala Asn Ala Arg Leu Arg Leu Arg Val
130 135 140

Ser Gly Ala Val Ala Thr Ala Val Leu Trp Val Leu Ala Ala Leu Leu
145 150 155 160

Ala Met Pro Val Met Val Leu Arg Thr Thr Gly Asp Leu Glu Asn Thr
165 170 175

Thr Lys Val Gln Cys Tyr Met Asp Tyr Ser Met Val Ala Thr Val Ser
180 185 190

Ser Glu Trp Ala Trp Glu Val Gly Leu Gly Val Ser Ser Thr Thr Val
195 200 205

Gly Phe Val Val Pro Phe Thr Ile Met Leu Thr Cys Tyr Phe Phe Ile
210 215 220

Ala Gln Thr Ile Ala Gly His Phe Arg Lys Glu Arg Ile Glu Gly Leu
225 230 235 240

Arg Lys Arg Arg Arg Leu Leu Ser Ile Ile Val Val Leu Val Val Thr
245 250 255

Phe Ala Leu Cys Trp Met Pro Tyr His Leu Val Lys Thr Leu Tyr Met
260 265 270

Leu Gly Ser Leu Leu His Trp Pro Cys Asp Phe Asp Leu Phe Leu Met
275 280 285

Asn Ile Phe Pro Tyr Cys Thr Cys Ile Ser Tyr Val Asn Ser Cys Leu
290 295 300

Asn Pro Phe Leu Tyr Ala Phe Phe Asp Pro Arg Phe Arg Gln Ala Cys
305 310 315 320

Thr Ser Met Leu Cys Cys Gly Gln Ser Arg Cys Ala Gly Thr Ser His
325 330 335

Ser Ser Ser Gly Glu Lys Ser Ala Ser Tyr Ser Ser Gly His Ser Gln
340 345 350

Gly Pro Gly Pro Asn Met Gly Lys Gly Gly Glu Gln Met His Glu Lys
355 360 365

Ser Ile Pro Tyr Ser Gln Glu Thr Leu Val Val Asp
370 375 380

(64) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
TGAGAATTCT GGTGACTCAC AGCCGGCACA G 31
(65) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

GCCGGATCCA AGGAAAAGCA GCAATAAAAG G 31
(66) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1119 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

ATGAACTACC CGCTAACGCT GGAAATGGAC CTCGAGAACC TGGAGGACCT GTTCTGGGAA 60 

CTGGACAGAT TGGACAACTA TAACGACACC TCCCTGGTGG AAAATCATCT CTGCCCTGCC 120 

ACAGAGGGTC CCCTCATGGC CTCCTTCAAG GCCGTGTTCG TGCCCGTGGC CTACAGCCTC 180 

ATCTTCCTCC TGGGCGTGAT CGGCAACGTC CTGGTGCTGG TGATCCTGGA GCGGCACCGG 240 

CAGACACGCA GTTCCACGGA GACCTTCCTG TTCCACCTGG CCGTGGCCGA CCTCCTGCTG 300 

GTCTTCATCT TGCCCTTTGC CGTGGCCGAG GGCTCTGTGG GCTGGGTCCT GGGGACCTTC 360 

CTCTGCAAAA CTGTGATTGC CCTGCACAAA GTCAACTTCT ACTGCAGCAG CCTGCTCCTG 420 

GCCTGCATCG CCGTGGACCG CTACCTGGCC ATTGTCCACG CCGTCCATGC CTACCGCCAC 480 

CGCCGCCTCC TCTCCATCCA CATCACCTGT GGGACCATCT GGCTGGTGGG CTTCCTCCTT 540 

GCCTTGCCAG AGATTCTCTT CGCCAAAGTC AGCCAAGGCC ATCACAACAA CTCCCTGCCA 600 

CGTTGCACCT TCTCCCAAGA GAACGAAGCA GAAACGCATG CCTGGTTCAC CTCCCGATTC 660 

CTCTACCATG TGGCGGGATT CCTGCTGCCC ATGCTGGTGA TGGGCTGGTG CTACGTGGGG 720 

GTAGTGCACA GGTTGCGCCA GGCCCAGCGG CGCCCTCAGC GGCAGAAGGC AGTCAGGGTG 780

GCCATCCTGG TGACAAGCAT CTTCTTCCTC TGCTGGTCAC CCTACCACAT CGTCATCTTC 840 

CTGGACACCC TGGCGAGGCT GAAGGCCGTG GACAATACCT GCAAGCTGAA TGGCTCTCTC 900 

CCCGTGGCCA TCACCATGTG TGAGTTCCTG GGCCTGGCCC ACTGCTGCCT CAACCCCATG 960 

CTCTACACTT TCGCCGGCGT GAAGTTCCGC AGTGACCTGT CGCGGCTCCT GACCAAGCTG 1020 

GGCTGTACCG GCCCTGCCTC CCTGTGCCAG CTCTTCCCTA GCTGGCGCAG GAGCAGTCTC 1080 

TCTGAGTCAG AGAATGCCAC CTCTCTCACC ACGTTCTAG 1119 
(67) INFORMATION FOR SEQ ID NO: 66: 
(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 372 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Met Asn Tyr Pro Leu Thr Leu Glu Met Asp Leu Glu Asn Leu Glu Asp
1 5 10 15

Leu Phe Trp Glu Leu Asp Arg Leu Asp Asn Tyr Asn Asp Thr Ser Leu
20 25 30

Val Glu Asn His Leu Cys Pro Ala Thr Glu Gly Pro Leu Met Ala Ser
35 40 45

Phe Lys Ala Val Phe Val Pro Val Ala Tyr Ser Leu Ile Phe Leu Leu
50 55 60

Gly Val Ile Gly Asn Val Leu Val Leu Val Ile Leu Glu Arg His Arg
65 70 75 80

Gln Thr Arg Ser Ser Thr Glu Thr Phe Leu Phe His Leu Ala Val Ala
85 90 95

Asp Leu Leu Leu Val Phe Ile Leu Pro Phe Ala Val Ala Glu Gly Ser
100 105 110

Val Gly Trp Val Leu Gly Thr Phe Leu Cys Lys Thr Val Ile Ala Leu
115 120 125

His Lys Val Asn Phe Tyr Cys Ser Ser Leu Leu Leu Ala Cys Ile Ala
130 135 140

Val Asp Arg Tyr Leu Ala Ile Val His Ala Val His Ala Tyr Arg His
145 150 155 160

Arg Arg Leu Leu Ser Ile His Ile Thr Cys Gly Thr Ile Trp Leu Val
165 170 175

Gly Phe Leu Leu Ala Leu Pro Glu Ile Leu Phe Ala Lys Val Ser Gln
180 185 190

Gly His His Asn Asn Ser Leu Pro Arg Cys Thr Phe Ser Gln Glu Asn
195 200 205

Gln Ala Glu Thr His Ala Trp Phe Thr Ser Arg Phe Leu Tyr His Val
210 215 220

Ala Gly Phe Leu Leu Pro Met Leu Val Met Gly Trp Cys Tyr Val Gly
225 230 235 240

Val Val His Arg Leu Arg Gln Ala Gln Arg Arg Pro Gln Arg Gln Lys
245 250 255

Ala Val Arg Val Ala Ile Leu Val Thr Ser Ile Phe Phe Leu Cys Trp
260 265 270

Ser Pro Tyr His Ile Val Ile Phe Leu Asp Thr Leu Ala Arg Leu Lys
275 280 285

Ala Val Asp Asn Thr Cys Lys Leu Asn Gly Ser Leu Pro Val Ala Ile
290 295 300

Thr Met Cys Glu Phe Leu Gly Leu Ala His Cys Cys Leu Asn Pro Met
305 310 315 320

Leu Tyr Thr Phe Ala Gly Val Lys Phe Arg Ser Asp Leu Ser Arg Leu
325 330 335

Leu Thr Lys Leu Gly Cys Thr Gly Pro Ala Ser Leu Cys Gln Leu Phe
340 345 350

Pro Ser Trp Arg Arg Ser Ser Leu Ser Glu Ser Glu Asn Ala Thr Ser
355 360 365

Leu Thr Thr Phe
370



(68) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 



CAAAGCTTGA AAGCTGCACG GTGCAGAGAC 30
(69) INFORMATION FOR SEQ ID NO:68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 



GCGGATCCCG AGTCACACCC TGGCTGGGCC 30
(70) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAG 60

CCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTCAT CGGCCTGTTC 180

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420

ATGTACAGCA GCGTCTTCTT CCTCACCTCG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GCTCCGCATG 780

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCACCGCCT TCTCCAACAG CTGCCTAAAC 960

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTCAA GGCCGTCATT 1080

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTAG 1128
(71) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 375 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala Gln Pro Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Leu Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Thr Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(72) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

ACAGAATTCC TGTGTGGTTT TACCGCCCAG 30

(73) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
CTCGGATCCA GGCAGAAGAG TCGCCTATGG 30
(74) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1137 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

ATGGACCTGG GGAAACCAAT GAAAAGCGTG CTGGTGGTGG CTCTCCTTGT CATTTTCCAG 60 

GTATGCCTGT GTCAAGATGA GGTCACGGAC GATTACATCG GAGACAACAC CACAGTGGAC 120

TACACTTTGT TCGAGTCTTT GTGCTCCAAG AAGGACGTGC GGAACTTTAA AGCCTGGTTC 180 

CTCCCTATCA TGTACTCCAT CATTTGTTTC GTGGGCCTAC TGGGCAATGG GCTGGTCGTG 240

TTGACCTATA TCTATTTCAA GAGGCTCAAG ACCATGACCG ATACCTACCT GCTCAACCTG 300 

GCGGTGGCAG ACATCCTCTT CCTCCTGACC CTTCCCTTCT GGGCCTACAG CGCGGCCAAG 360 

TCCTGGGTCT TCGGTGTCCA CTTTTGCAAG CTCATCTTTG CCATCTACAA GATGAGCTTC 420

TTCAGTGGCA TGCTCCTACT TCTTTGCATC AGCATTGACC GCTACGTGGC CATCGTCCAG 480 

GCTGTCTCAG CTCACCGCCA CCGTGCCCGC GTCCTTCTCA TCAGCAAGCT GTCCTGTGTG 540 

GGCATCTGGA TACTAGCCAC AGTGCTCTCC ATCCCAGAGC TCCTGTACAG TGACCTCCAG 600 

AGGAGCAGCA GTGAGCAAGC GATGCGATGC TCTCTCATCA CAGAGCATGT GGAGGCCTTT 660 

ATCACCATCC AGGTGGCCCA GATGGTGATC GGCTTTCTGG TCCCCCTGCT GGCCATGAGC 720

TTCTGTTACC TTGTCATCAT CCGCACCCTG CTCCAGGCAC GCAACTTTGA GCGCAACAAG 780 

GCCATCAAGG TGATCATCGC TGTGGTCGTG GTCTTCATAG TCTTCCAGCT GCCCTACAAT 840 

GGGGTGGTCC TGGCCCAGAC GGTGGCCAAC TTCAACATCA CCAGTAGCAC CTGTGAGCTC 900 

AGTAAGCAAC TCAACATCGC CTACGACGTC ACCTACAGCC TGGCCTGCGT CCGCTGCTGC 960 

GTCAACCCTT TCTTGTACGC CTTCATCGGC GTCAAGTTCC GCAACGATCT CTTCAAGCTC 1020

TTCAAGGACC TGGGCTGCCT CAGCCAGGAG CAGCTCCGGC AGTGGTCTTC CTGTCGGCAC 1080 

ATCCGGCGCT CCTCCATGAG TGTGGAGGCC GAGACCACCA CCACCTTCTC CCCATAG 1137 
(75) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 378 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu
1 5 10 15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr
20 25 30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys
35 40 45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met
50 55 60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65 70 75 80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr
85 90 95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro
100 105 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe
115 120 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met
130 135 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145 150 155 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys
165 170 175

Leu Ser Cys Val Gly Ile Trp Ile Leu Ala Thr Val Leu Ser Ile Pro
180 185 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
195 200 205

Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln
210 215 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225 230 235 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe
245 250 255

Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val Phe
260 265 270

Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val
275 280 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu
290 295 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305 310 315 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp
325 330 335

Leu Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu
340 345 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val
355 360 365

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
370 375

(76) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

CTGGAATTCA CCTGGACCAC CACCAATGGA TA 32

(77) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
CTCGGATCCT GCAAAGTTTG TCATACAGTT 30

(78) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1085 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
ATGGATATAC AAATGGCAAA CAATTTTACT CCGCCCTCTC CAACTCCTCA GGGAAATGAC 60
TGTGACCTCT ATGCACATCA CAGCACGGCC AGGATAGTAA TGCCTCTGCA TTACAGCCTC 120
GTCTTCATCA TTGGGCTCGT GGGAAACTTA CTAGCCTTGG TCGTCATTGT TCAAAACAGG 180
AAAAAAATCA ACTCTACCAC CCTCTATTCA ACAAATTTGG TGATTTCTGA TATACTTTTT 240
ACCACGGCTT TGCCTACACG AATAGCCTAC TATCCAATGG GCTTKSACTG GAGAATCGGA 300
GATGCCTTGT GTAGGATAAC TGCGCTAGTG TTTTACATCA ACACATATGC AGGTGTGAAC 360
TTTATGACCT GCCTGAGTAT TGACCGCTTC ATTGCTGTGG TGCACCCTCT ACGCTACAAC 420
AAGATAAAAA GGATTGAACA TGCAAAAGGC GTGTGCATAT TTGTCTGGAT TCTAGTATTT 480
GCTCAGACAC TCCCACTCCT CATCAACCCT ATCTCAAAGC AGGAGGCTGA AAGGATTACA 540
TGCATGGAGT ATCCAAACTT TGAAGAAACT AAATCTCTTC CCTGGATTCT GCTTGGGGCA 600
TGTTTCATAG GATATGTACT TCCACTTATA ATCATTCTCA TCTGCTATTC TCAGATCTGC 660
TGCAAACTCT TCAGAACTGC CAAACAAAAC CCACTCACTG AGAAATCTGG TGTAAACAAA 720
AAGGCTCTCA ACACAATTAT TCTTATTATT GTTGTGTTTG TTCTCTGTTT CACACCTTAC 780
CATGTTGCAA TTATTCAACA TATGATTAAG AAGCTTCGTT TCTCTAATTT CCTGGAATGT 840
AGCCAAAGAC ATTCGTTCCA GATTTCTCTG CACTTTACAG TATGCCTGAT GAACTTCAAT 900
TGCTGCATGG ACCCTTTTAT CTACTTCTTT GCATGTAAAG GGTATAAGAG AAAGGTTATG 960
AGGATGCTGA AACGGCAAGT CAGTGTATCG ATTTCTAGTC CTGTGAAGTC AGCCCCTGAA 1020
GAAAATTCAC GTGAAATGAC AGAAACGCAG ATGATGATAC ATTCCAAGTC TTCAAATGGA 1080
AAGTGA 1086

(79) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Met Asp Ile Gln Met Ala Asn Asn Phe Thr Pro Pro Ser Ala Thr Pro
1 5 10 15

Gln Gly Asn Asp Cys Asp Leu Tyr Ala His His Ser Thr Ala Arg Ile
20 25 30

Val Met Pro Leu His Tyr Ser Leu Val Phe Ile Ile Gly Leu Val Gly
35 40 45

Asn Leu Leu Ala Leu Val Val Ile Val Gln Asn Arg Lys Lys Ile Asn
50 55 60

Ser Thr Thr Leu Tyr Ser Thr Asn Leu Val Ile Ser Asp Ile Leu Phe
65 70 75 80

Thr Thr Ala Leu Pro Thr Arg Ile Ala Tyr Tyr Ala Met Gly Phe Asp
85 90 95

Trp Arg Ile Gly Asp Ala Leu Cys Arg Ile Thr Ala Leu Val Phe Tyr
100 105 110

Ile Asn Thr Tyr Ala Gly Val Asn Phe Met Thr Cys Leu Ser Ile Asp
115 120 125

Arg Phe Ile Ala Val Val His Pro Leu Arg Tyr Asn Lys Ile Lys Arg
130 135 140

Ile Glu His Ala Lys Gly Val Cys Ile Phe Val Trp Ile Leu Val Phe
145 150 155 160

Ala Gln Thr Leu Pro Leu Leu Ile Asn Pro Met Ser Lys Gln Glu Ala
165 170 175

Glu Arg Ile Thr Cys Met Glu Tyr Pro Asn Phe Glu Glu Thr Lys Ser
180 185 190

Leu Pro Trp Ile Leu Leu Gly Ala Cys Phe Ile Gly Tyr Val Leu Pro
195 200 205

Leu Ile Ile Ile Leu Ile Cys Tyr Ser Gln Ile Cys Cys Lys Leu Phe
210 215 220

Arg Thr Ala Lys Gln Asn Pro Leu Thr Glu Lys Ser Gly Val Asn Lys
225 230 235 240

Lys Ala Leu Asn Thr Ile Ile Leu Ile Ile Val Val Phe Val Leu Cys
245 250 255

Phe Thr Pro Tyr His Val Ala Ile Ile Gln His Met Ile Lys Lys Leu
260 265 270

Arg Phe Ser Asn Phe Leu Glu Cys Ser Gln Arg His Ser Phe Gln Ile
275 280 285

Ser Leu His Phe Thr Val Cys Leu Met Asn Phe Asn Cys Cys Met Asp
290 295 300

Pro Phe Ile Tyr Phe Phe Ala Cys Lys Gly Tyr Lys Arg Lys Val Met
305 310 315 320

Arg Met Leu Lys Arg Gln Val Ser Val Ser Ile Ser Ser Ala Val Lys
325 330 335

Ser Ala Pro Glu Glu Asn Ser Arg Glu Met Thr Glu Thr Gln Met Met
340 345 350

Ile His Ser Lys Ser Ser Asn Gly Lys
355 360

(80) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

CTGGAATTCT CCTGCTCATC CAGCCATGCG G 31

(81) INFORMATION FOR SEQ ID NO: 80: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
CCTGGATCCC CACCCCTACT GGGGCCTCAG 30

(82) INFORMATION FOR SEQ ID NO: 81: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
ATGCGGTGGC TCTGGCCCCT GGCl^TCTCT CTTGCTGTGA TTTTGGCTGT GGGGCTAAGC 60
AGGGTCTCTG GGGGTGCCCC CCT^CACCTG GGCAGGCACA GAGC
CAGAGCCGAT CCAAGAGGGG CACCGAGGAT GAGGAGGCCA AGGGCGTGCA GCAGTATGTG 180
CCTGAGGAGT GGGCGGAGTA CCCCCGGCCC ATTCACCCTG CTGGCCTGCA GCCAACCAAG 240
CCCTTGGTGG CCACCAGCCC TAACCCCGAC AAGGATGGGG GCACCCCAGA CAGTGGGCAG 300
GAACTGAGGG GCAATCTGAC AGGGGCACCA GGGCAGAGGC TACAGATCCA GAACCCCCTG 360
TATCCGGTGA CCGAGAGCTC CTACAGTGCC TATGCCATCA TGCTTCTGGC GCTGGTGGTG 420
TTTGCGGTGG GCATTGTGGG CAACCTGTCG GTCATGTGCA TCGTGTGGCA CAGCTACTAC 480
CTGAAGAGCG CCTGGAACTC CATCCTTGCC AGCCTGGCCC TCTGGGATTT TCTGGTCCTC 540
TTTTTCTGCC TCCCTATTGT CATCTTCAAC GAGATCACCA AGCAGAGGCT ACTGGGTGAC 600
GTTTCTTGTC GTGCCGTGCC CTTCATGGAG GTCTCCTCTC TGGGAGTCAC GACTTTCAGC 660
CTCTGTGCCC TGGGCATTGA CCGCTTCCAC GTGGCCACCA GCACCCTGCC CAAGGTGAGG 720
CCCATCGAGC GGTGCCAATC CATCCTGGCC AAGTTGGCTG TCATCTGGGT GGGCTCCATG 780
ACGCTGGCTG TGCCTGAGCT CCTGCTGTGG CAGCTGGCAC AGGAGCCTGC CCCCACCATG 840
GGCACCCTGG ACTCATGCAT CATGAAACCC TCAGCCAGCC TGCCCGAGTC CCTGTATTCA 900
CTGGTGATGA CCTACCAGAA GGCCCGCATG TGGTGGTACT TTGGCTGCTA CTTCTGCCTG 960
CCCATCCTCT TCACAGTCAC CTGCCAGCTG GTGACATGGC GGGTGCGAGG CCCTCCAGGG 1020
AGGAAGTCAG AGTGCAGGGC CAGCAAGCAC GAGCAGTGTG AGAGCCAGCT CAACAGCACC 1080
GTGGTGGGCC TGACCGTGGT CTACGCCTTC TGCACCCTCC CAGAGAACGT CTGCAACATC 1140
GTGGTGGCCT ACCTCTCCAC CGAGCTGACC CGCCAGACCC TGGACCTCCT GGGCCTCATC 1200
AACCAGTTCT CCACCTTCTT CAAGGGCGCC ATCACCCCAG TGCTGCTGCT TTGCATCTGC 1260
AGGCCGCTGG GCCAGGCCTT CCTGGACTGC TGCTGCTGCT GCTGCTGTGA GGAGTGCGGC 1320
GGGGCTTCGG AGGCCTCTGC TGCCAATGGG TCGGACAACA AGCTCAAGAC CGAGGTGTCC 1380
TCTTCCATCT ACTTCCACAA GCCCAGGGAG TCACCCCCAC TCCTGCCCCT GGGCACACCT 1440
TGCTGA 1446

(83) INFORMATION FOR SEQ ID NO: 82:



(83) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Met Arg Trp Leu Trp Pro Leu Ala Val Ser Leu Ala Val Ile Leu Ala
1 5 10 15

Val Gly Leu Ser Arg Val Ser Gly Gly Ala Pro Leu His Leu Gly Arg
20 25 30

His Arg Ala Glu Thr Gln Glu Gln Gln Ser Arg Ser Lys Arg Gly Thr
35 40 45

Glu Asp Glu Glu Ala Lys Gly Val Gln Gln Tyr Val Pro Glu Glu Trp
50 55 60

Ala Glu Tyr Pro Arg Pro Ile His Pro Ala Gly Leu Gln Pro Thr Lys
65 70 75 80

Pro Leu Val Ala Thr Ser Pro Asn Pro Asp Lys Asp Gly Gly Thr Pro
85 90 95

Asp Ser Gly Gln Glu Leu Arg Gly Asn Leu Thr Gly Ala Pro Gly Gln
100 105 110

Arg Leu Gln Ile Gln Asn Pro Leu Tyr Pro Val Thr Glu Ser Ser Tyr
115 120 125

Ser Ala Tyr Ala Ile Met Leu Leu Ala Leu Val Val Phe Ala Val Gly
130 135 140

Ile Val Gly Asn Leu Ser Val Met Cys Ile Val Trp His Ser Tyr Tyr
145 150 155 160

Leu Lys Ser Ala Trp Asn Ser Ile Leu Ala Ser Leu Ala Leu Trp Asp
165 170 175

Phe Leu Val Leu Phe Phe Cys Leu Pro Ile Val Ile Phe Asn Glu Ile
180 185 190

Thr Lys Gln Arg Leu Leu Gly Asp Val Ser Cys Arg Ala Val Pro Phe
195 200 205

Met Glu Val Ser Ser Leu Gly Val Thr Thr Phe Ser Leu Cys Ala Leu
210 215 220

Gly Ile Asp Arg Phe His Val Ala Thr Ser Thr Leu Pro Lys Val Arg
225 230 235 240

Pro Ile Glu Arg Cys Gln Ser Ile Leu Ala Lys Leu Ala Val Ile Trp
245 250 255

Val Gly Ser Met Thr Leu Ala Val Pro Glu Leu Leu Leu Trp Gln Leu
260 265 270

Ala Gln Glu Pro Ala Pro Thr Met Gly Thr Leu Asp Ser Cys Ile Met
275 280 285

Lys Pro Ser Ala Ser Leu Pro Glu Ser Leu Tyr Ser Leu Val Met Thr
290 295 300

Tyr Gln Asn Ala Arg Met Trp Trp Tyr Phe Gly Cys Tyr Phe Cys Leu
305 310 315 320

Pro Ile Leu Phe Thr Val Thr Cys Gln Leu Val Thr Trp Arg Val Arg
325 330 335

Gly Pro Pro Gly Arg Lys Ser Glu Cys Arg Ala Ser Lys His Glu Gln
340 345 350

Cys Glu Ser Gln Leu Asn Ser Thr Val Val Gly Leu Thr Val Val Tyr
355 360 365

Ala Phe Cys Thr Leu Pro Glu Asn Val Cys Asn Ile Val Val Ala Tyr
370 375 380

Leu Ser Thr Glu Leu Thr Arg Gln Thr Leu Asp Leu Leu Gly Leu Ile
385 390 395 400

Asn Gln Phe Ser Thr Phe Phe Lys Gly Ala Ile Thr Pro Val Leu Leu
405 410 415

Leu Cys Ile Cys Arg Pro Leu Gly Gln Ala Phe Leu Asp Cys Cys Cys
420 425 430

Cys Cys Cys Cys Glu Glu Cys Gly Gly Ala Ser Glu Ala Ser Ala Ala
435 440 445

Asn Gly Ser Asp Asn Lys Leu Lys Thr Glu Val Ser Ser Ser Ile Tyr
450 455 460

Phe His Lys Pro Arg Glu Ser Pro Pro Leu Leu Pro Leu Gly Thr Pro
465 470 475 480

Cys



(84) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

ATGTGGAACG CGACGCCCAG CG 22

(85) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
TCATGTATTA ATACTAGATT CT 22

(86) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

TACCATGTGG AACGCGACGC CCAGCGAAGA GCCGGGGT 
(87) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

CGGAATTCAT GTATTAATAC TAGATTCTGT CCAGGCCCG
(88) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
ATGTGGAACG CGACGCCCAG CGAAGAGCCG GGGTTCAACC TCACACTGGC CGACCTGGAC 60

TGGGATGCTT CCCCCGGCAA CGACTCGCTG GGCGACGAGC TGCTGCAGCT CTTCCCCGCG 120 

CCGCTGCTGG CGGGCGTCAC AGCCACCTGC GTGGCACTCT TCGTGGTGGG TATCGCTGGC 180 

AACCTGCTCA CCATGCTGGT GGTGTCGCGC TTCCGCGAGC TGCGCACCAC CACCAACCTC 240 

TACCTGTCCA GCATGGCCTT CTCCGATCTG CTCATCTTCC TCTGCATGCC CCTGGACCTC 300 

GTTCGCCTCT GGCAGTACCG GCCCTGGAAC TTCGGCGACC TCCTCTGCAA ACTCTTCCAA 360 

TTCGTCAGTG AGAGCTGCAC CTACGCCACG GTGCTCACCA TCACAGCGCT GAGCGTCGAG 420 

CGCTACTTCG CCATCTGCTT CCCACTCCGG GCCAAGGTGG TGGTCACCAA GGGGCGGGTG 480 

AAGCTGGTCA TCTTCGTCAT CTGGGCCGTG GCCTTCTGCA GCGCCGGGCC CATCTTCGTG 540 

CTAGTCGGGG TGGAGCACGA GAACGGCACC GACCCTTGGG ACACCAACGA GTGCCGCCCC 600 

ACCGAGTTTG CGGTGCGCTC TGGACTGCTC ACGGTCATGG TGTGGGTGTC CAGCATCTTC 660 

TTCTTCCTTC CTGTCTTCTG TCTCACGGTC CTCTACAGTC TCATCGGCAG GAAGCTGTGG 720 

CGGAGGAGGC GCGGCGATGC TGTCGTGGGT GCCTCGCTCA GGGACCAGAA CCACAAGCAA 780 

ACCGTGAAAA TGCTGGCTGT AGTGGTGTTT GCCTTCATCC TCTGCTGGCT CCCCTTCCAC 840 

GTAGGGCGAT ATTTATTTTC CAAATCCTTT GAGCCTGGCT CCTTGGAGAT TGCTCAGATC 900 

AGCCAGTACT GCAACCTCGT GTCCTTTGTC CTCTTCTACC TCAGTGCTGC CATCAACCCC 960 

ATTCTGTACA ACATCATGTC CAAGAAGTAC CGGGTGGCAG TGTTCAGACT TCTGGGATTC 1020

GAACCCTTCT CCCAGAGAAA GCTCTCCACT CTGAAAGATG AAAGTTCTCG GGCCTGGACA 1080 

GAATCTAGTA TTAATACATG A 1101
(89) INFORMATION FOR SEQ ID NO:88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Met Trp Asn Ala Thr Pro Ser Glu Glu Pro Gly Phe Asn Leu Thr Leu
1 5 10 15

Ala Asp Leu Asp Trp Asp Ala Ser Pro Gly Asn Asp Ser Leu Gly Asp
20 25 30

Glu Leu Leu Gln Leu Phe Pro Ala Pro Leu Leu Ala Gly Val Thr Ala
35 40 45

Thr Cys Val Ala Leu Phe Val Val Gly Ile Ala Gly Asn Leu Leu Thr
50 55 60

Met Leu Val Val Ser Arg Phe Arg Glu Leu Arg Thr Thr Thr Asn Leu
65 70 75 80

Tyr Leu Ser Ser Met Ala Phe Ser Asp Leu Leu Ile Phe Leu Cys Met
85 90 95

Pro Leu Asp Leu Val Arg Leu Trp Gln Tyr Arg Pro Trp Asn Phe Gly
100 105 110

Asp Leu Leu Cys Lys Leu Phe Gln Phe Val Ser Glu Ser Cys Thr Tyr
115 120 125

Ala Thr Val Leu Thr Ile Thr Ala Leu Ser Val Glu Arg Tyr Phe Ala
130 135 140

Ile Cys Phe Pro Leu Arg Ala Lys Val Val Val Thr Lys Gly Arg Val
145 150 155 160

Lys Leu Val Ile Phe Val Ile Trp Ala Val Ala Phe Cys Ser Ala Gly
165 170 175

Pro Ile Phe Val Leu Val Gly Val Glu His Glu Asn Gly Thr Asp Pro
180 185 190

Trp Asp Thr Asn Glu Cys Arg Pro Thr Glu Phe Ala Val Arg Ser Gly
195 200 205

Leu Leu Thr Val Met Val Trp Val Ser Ser Ile Phe Phe Phe Leu Pro
210 215 220

Val Phe Cys Leu Thr Val Leu Tyr Ser Leu Ile Gly Arg Lys Leu Trp
225 230 235 240

Arg Arg Arg Arg Gly Asp Ala Val Val Gly Ala Ser Leu Arg Asp Gln
245 250 255

Asn His Lys Gln Thr Val Lys Met Leu Ala Val Val Val Phe Ala Phe
260 265 270

Ile Leu Cys Trp Leu Pro Phe His Val Gly Arg Tyr Leu Phe Ser Lys
275 280 285

Ser Phe Glu Pro Gly Ser Leu Glu Ile Ala Gln Ile Ser Gln Tyr Cys
290 295 300

Asn Leu Val Ser Phe Val Leu Phe Tyr Leu Ser Ala Ala Ile Asn Pro
305 310 315 320

Ile Leu Tyr Asn Ile Met Ser Lys Lys Tyr Arg Val Ala Val Phe Arg
325 330 335

Leu Leu Gly Phe Glu Pro Phe Ser Gln Arg Lys Leu Ser Thr Leu Lys
340 345 350

Asp Glu Ser Ser Arg Ala Trp Thr Glu Ser Ser Ile Asn Thr
355 360 365

(90) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
GCAAGCTTGT GCCCTCACCA AGCCATGCGA GCC 33 

(91) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
CGGAATTCAG CAATGAGTTC CGACAGAAGC 30 

(92) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

ATGCGAGCCC CGGGCGCGCT TCTCGCCCGC ATGTCGCGGC TACTGCTTCT GCTACTGCTC 60 

AAGGTGTCTG CCTCTTCTGC CCTCGGGGTC GCCCCTGCGT CCAGAAACGA AACTTGTCTG 120 

GGGGAGAGCT GTGCACCTAC AGTGATCCAG CGCCGCGGCA GGGACGCCTG GGGACCGGGA 180



AATTCTGCAA GAGACGTTCT GCGAGCCCGA GCACCCAGGG AGGAGCAGGG GGCAGCGTTT 240
CTTCCGGGAC CCTCCTGGGA CCTCCCGGCG GCCCCGGGCC GTCACCCGGC TGCAGGCAGA 300
<WgGAGG CGTCGGCAGC CGGACCCCCG GGACCTCCAA CCAGGCCACC TGGCCCCTGG 360 
AGGTGGAAAG GTGCTCGGGG TCAGGAGCCT TCTGAAACTT TGGGGAGAGG GAACCCCACG 420
GCCCTCCAGC TCTTCCTTCA GATCTCAGAG GAGGAAGAGA AGGGTCCCAG AGGCGCl^C 480
ATTTCCGGGC GTAGCCAGGA GCAGAGTCTG AAGACAGTCC CCGGAGCCAG CGATCTO 540
TACTGGCCAA GGAGAGCCGG GAAACTCCAG GG^TCCCACC ACAAGCCCCT GTCCAAGACG 600
GCCAATGGAC TGGCGGGGCA CGAAGGGTGG ACAAOTGCAC TCCCGGGCCG GGCGCTCGCC 660
CAGAATGGAT CCTTGGGTGA AGGAATCCAT GAGCCTCGGG GTCCCCGCCG GGGAAACAGC 720
ACGAACCGGC GTGTGAGACT GAAGAACCCC TTCTACCCGC TOACCCAGGA GTCCTATGGA 780
GCCTAC^CGG TCATGTGTCT GTCOn^GTG ATCTTCGGGA CCGGCATCAT ...G^^^^ 840
GCGGTGATGA GCATCGTGTG CCACAACTAC TACATGCGGA GCATCTCCAA CTCCCTCTTG 900
GCCAACCTGG CCTTCTGGGA CTTTCTCATC ATCTTCTTCT GCCTTCCGCT GGTCATCTTC 960
CACGAGCTGA CCAAGAAGTG GCTGCTGGAG GACTTCTCCT GCAAGATCGT GCCCTATATA 1020
GAGGTCGCTT CTCTGGGAGT CACCACTTTC ACCTTATGTG CTCTGTGCAT AGACCGCTTC 1080
OGTCCTOCCA CCAACGTACA GATGTACTAC GAAATGATCG AAAACTGTTC CTCAAC^^ 1140
GCCAAACTTG CTGTTATATG GGTGGGAGCT CTATTGTTAG CACTTCCAGA AGl^ITCTC 1200
CGCCAGCTGA GCAAGGAGGA TTTGGGGrtT AGl^CCGAG CTCCGGCAGA AAGGl^CATT 1260
ATTAAGATCT CTCCTGATTT ACCAGACACC ATCTATGTTC TAGCCCTCAC CTACGACAGT 1320
GCGAGACTGT GGTGGTATTT TGGCTGTTAC TTTTGTTTGC CCACGCTTTT CACCATCACC 1380
^CTCTCTAG TGACTGCGAG GAAAATCCGC AAAGCAGAGA AAGCCTGTAC CCGAGGG^ 1440
AAACGGCAGA TTCAACTAGA GAGTCAGATG AACTGTACAG TAG-K3GCACT GACCATTTTA 1500
TATGGATTTT GCATTATTCC TGAAAATATC TGCAACATTG TTACTGCCTA CAl^CTACA 1560
GGGGTTTCAC AGCAGACAAT GGACCTCCTT AATATCATCA GCCAGTTCCT TTTGTTCTTT 1620
AAGTCCTGTC TCACCCCAGT CCTCCTTTTC TGTCTCTGCA AACCCTTCAG TCGGGCCTTC 1680
25 ATGGAGTGCT GCa<Kn«TTG CTGl^GGAA O^SCAWCAGA AGTCITCAAC GGl^^ 1740 
GATGACAATG ACAACGAGTA CACCACGGAA CTCGAACTCT CGCCTTTCAG TACCATACGC 1800 
CGTGAAATGT CCACTTTTGC TTCTGTCGGA ACTCATTGCT GA 1842
(93) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 



Met Arg Ala Pro Gly Ala Leu Leu Ala Arg Met Ser Arg Leu Leu Leu
1 5 10 15

Leu Leu Leu Leu Lys Val Ser Ala Ser Ser Ala Leu Gly Val Ala Pro
20 25 30

Ala Ser Arg Asn Glu Thr Cys Leu Gly Glu Ser Cys Ala Pro Thr Val
35 40 45

Ile Gln Arg Arg Gly Arg Asp Ala Trp Gly Pro Gly Asn Ser Ala Arg
50 55 60

Asp Val Leu Arg Ala Arg Ala Pro Arg Glu Glu Gln Gly Ala Ala Phe
65 70 75 80

Leu Ala Gly Pro Ser Trp Asp Leu Pro Ala Ala Pro Gly Arg Asp Pro
85 90 95

Ala Ala Gly Arg Gly Ala Glu Ala Ser Ala Ala Gly Pro Pro Gly Pro
100 105 110

Pro Thr Arg Pro Pro Gly Pro Trp Arg Trp Lys Gly Ala Arg Gly Gln
115 120 125

Glu Pro Ser Glu Thr Leu Gly Arg Gly Asn Pro Thr Ala Leu Gln Leu
130 135 140

Phe Leu Gln Ile Ser Glu Glu Glu Glu Lys Gly Pro Arg Gly Ala Gly
145 150 155 160

Ile Ser Gly Arg Ser Gln Glu Gln Ser Val Lys Thr Val Pro Gly Ala
165 170 175

Ser Asp Leu Phe Tyr Trp Pro Arg Arg Ala Gly Lys Leu Gln Gly Ser
180 185 190

His His Lys Pro Leu Ser Lys Thr Ala Asn Gly Leu Ala Gly His Glu
195 200 205

Gly Trp Thr Ile Ala Leu Pro Gly Arg Ala Leu Ala Gln Asn Gly Ser
210 215 220

Leu Gly Glu Gly Ile His Glu Pro Gly Gly Pro Arg Arg Gly Asn Ser
225 230 235 240

Thr Asn Arg Arg Val Arg Leu Lys Asn Pro Phe Tyr Pro Leu Thr Gln
245 250 255

Glu Ser Tyr Gly Ala Tyr Ala Val Met Cys Leu Ser Val Val Ile Phe
260 265 270

Gly Thr Gly Ile Ile Gly Asn Leu Ala Val Met Ser Ile Val Cys His
275 280 285

Asn Tyr Tyr Met Arg Ser Ile Ser Asn Ser Leu Leu Ala Asn Leu Ala
290 295 300

Phe Trp Asp Phe Leu Ile Ile Phe Phe Cys Leu Pro Leu Val Ile Phe
305 310 315 320

His Glu Leu Thr Lys Lys Trp Leu Leu Glu Asp Phe Ser Cys Lys Ile
325 330 335

Val Pro Tyr Ile Glu Val Ala Ser Leu Gly Val Thr Thr Phe Thr Leu
340 345 350

Cys Ala Leu Cys Ile Asp Arg Phe Arg Ala Ala Thr Asn Val Gln Met
355 360 365

Tyr Tyr Glu Met Ile Glu Asn Cys Ser Ser Thr Thr Ala Lys Leu Ala
370 375 380

Val Ile Trp Val Gly Ala Leu Leu Leu Ala Leu Pro Glu Val Val Leu
385 390 395 400

Arg Gln Leu Ser Lys Glu Asp Leu Gly Phe Ser Gly Arg Ala Pro Ala
405 410 415

Glu Arg Cys Ile Ile Lys Ile Ser Pro Asp Leu Pro Asp Thr Ile Tyr
420 425 430

Val Leu Ala Leu Thr Tyr Asp Ser Ala Arg Leu Trp Trp Tyr Phe Gly
435 440 445

Cys Tyr Phe Cys Leu Pro Thr Leu Phe Thr Ile Thr Cys Ser Leu Val
450 455 460

Thr Ala Arg Lys Ile Arg Lys Ala Glu Lys Ala Cys Thr Arg Gly Asn
465 470 475 480

Lys Arg Gln Ile Gln Leu Glu Ser Gln Met Asn Cys Thr Val Val Ala
485 490 495

Leu Thr Ile Leu Tyr Gly Phe Cys Ile Ile Pro Glu Asn Ile Cys Asn
500 505 510

Ile Val Thr Ala Tyr Met Ala Thr Gly Val Ser Gln Gln Thr Met Asp
515 520 525

Leu Leu Asn Ile Ile Ser Gln Phe Leu Leu Phe Phe Lys Ser Cys Val
530 535 540

Thr Pro Val Leu Leu Phe Cys Leu Cys Lys Pro Phe Ser Arg Ala Phe
545 550 555 560

Met Glu Cys Cys Cys Cys Cys Cys Glu Glu Cys Ile Gln Lys Ser Ser
565 570 575

Thr Val Thr Ser Asp Asp Asn Asp Asn Glu Tyr Thr Thr Glu Leu Glu
580 585 590

Leu Ser Pro Phe Ser Thr Ile Arg Arg Glu Met Ser Thr Phe Ala Ser
595 600 605

Val Gly Thr His Cys
610



(94) INFORMATION FOR SEQ ID NO: 93: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CAGAATTCAG AGAAAAAAAG TGAATATGGT TTTT 
(95) INFORMATION FOR SEQ ID NO: 94: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
TTGGATCCCT GGTGCATAAC AATTGAAAGA AT
(96) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1248 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
ATGGTTTTTG CTCACAGAAT GGATAACAGC AAGCCACATT TCATTATTCC TACACTTCTC 60
GTGCCCCTCC AAAACCGCAG CTGCACTGAA ACAGCCACAC CTCTGCCAAG CCAATACCTG 120
ATGGAATTAA GTGAGGAGCA CAGTTGGATG AGCAACCAAA CAGACCTTCA CTATGTGCTG 180
AAACCCGGGG AAGTGGCCAC AGCCAGCATC TTCTTTGGGA TTCTGTCGTT GTTTTCTATC 240
TTCGGCAATT CCCTGGTTTG TTTGGTCATC CATAGGAGTA GGAGGACTCA GTCTACCACC 300
AACTACTTTG TGGTCTCCAT GGCATGTGCT GACCTTCTCA TCAGCGTTGC CAGCACGCCT 360
TTCGTCCTGC TCCAGTTCAC CACTGGAAGG TGGACGCTCG GTAGTGCAAC GTGCAAGGTT 420
GTGCGATATT TTCAATATCT CACTCCAGGT GTCCAGATCT ACGTTCTCCT CTCCATCTGC 480
ATAGACCGGT TCTACACCAT CGTCTATCCT CTGAGCTTCA AGGTGTCCAG AGAAAAAGCC 540
AAGAAAATGA TTGCGGCATC GTGGATCTTT GATGCAGGCT TTGTGACCCC TGTGCTCTTT 600
TTCTATGGCT CCAACTGGGA CAGTCATTGT AACTATTTCC TCCCCTCCTC TTGGGAAGGC 660
ACTGCCTACA CTGTCATCCA CTTCTTGGTG GGCTTTGTCA TTCCATCTGT CCTCATAATT 720
TTATTTTACC AAAAGGTCAT AAAATATATT TGGAGAATAG GCACAGATGG CCGAACGGTG 780
AGGAGGACAA TGAACATTGT CCCTCGGACA AAAGTGAAAA CTATCAAGAT GTTCCTCATT 840
TTAAATCTGT TGTTTTTGCT CTCCTGGCTG CCTTTTCATG TAGCTCAGCT ATGGCACCCC 900
GATGAACAAG ACTATAAGAA AAGTTCCCTT GTTTTCACAG CTATCACATG GATATCCTTT 960
AGTTCTTCAG CCTCTAAACC TACTCTGTAT TCAATTTATA ATGCCAATTT TCGGAGAGGG 1020
ATGAAAGAGA CTTTTTGCAT GTCCTCTATG AAATGTTACC GAAGCAATGC CTATACTATC 1080
ACAACAAGTT CAAGGATGGC CAAAAAAAAC TACGTTGGCA TTTCAGAAAT CCCTTCCATC 1140
GCCAAAACTA TTACCAAAGA CTCGATCTAT GACTCATTTG ACAGAGAAGC CAAGGAAAAA 1200
AAGCTTGCTT GGCCCATTAA CTCAAATCCA CCAAATACTT TTGTCTAA 1248
(97) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 415 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Val Phe Ala His Arg Met Asp Asn Ser Lys Pro His Leu Ile Ile
1 5 10 15

Pro Thr Leu Leu Val Pro Leu Gln Asn Arg Ser Cys Thr Glu Thr Ala
20 25 30

Thr Pro Leu Pro Ser Gln Tyr Leu Met Glu Leu Ser Glu Glu His Ser
35 40 45

Trp Met Ser Asn Gln Thr Asp Leu His Tyr Val Leu Lys Pro Gly Glu
50 55 60

Val Ala Thr Ala Ser Ile Phe Phe Gly Ile Leu Trp Leu Phe Ser Ile
65 70 75 80

Phe Gly Asn Ser Leu Val Cys Leu Val Ile His Arg Ser Arg Arg Thr
85 90 95

Gln Ser Thr Thr Asn Tyr Phe Val Val Ser Met Ala Cys Ala Asp Leu
100 105 110

Leu Ile Ser Val Ala Ser Thr Pro Phe Val Leu Leu Gln Phe Thr Thr
115 120 125

Gly Arg Trp Thr Leu Gly Ser Ala Thr Cys Lys Val Val Arg Tyr Phe
130 135 140

Gln Tyr Leu Thr Pro Gly Val Gln Ile Tyr Val Leu Leu Ser Ile Cys
145 150 155 160

Ile Asp Arg Phe Tyr Thr Ile Val Tyr Pro Leu Ser Phe Lys Val Ser
165 170 175

Arg Glu Lys Ala Lys Lys Met Ile Ala Ala Ser Trp Ile Phe Asp Ala
180 185 190

Gly Phe Val Thr Pro Val Leu Phe Phe Tyr Gly Ser Asn Trp Asp Ser
195 200 205

His Cys Asn Tyr Phe Leu Pro Ser Ser Trp Glu Gly Thr Ala Tyr Thr
210 215 220

Val Ile His Phe Leu Val Gly Phe Val Ile Pro Ser Val Leu Ile Ile
225 230 235 240

Leu Phe Tyr Gln Lys Val Ile Lys Tyr Ile Trp Arg Ile Gly Thr Asp
245 250 255

Gly Arg Thr Val Arg Arg Thr Met Asn Ile Val Pro Arg Thr Lys Val
260 265 270

Lys Thr Ile Lys Met Phe Leu Ile Leu Asn Leu Leu Phe Leu Leu Ser
275 280 285

Trp Leu Pro Phe His Val Ala Gln Leu Trp His Pro His Glu Gln Asp
290 295 300

Tyr Lys Lys Ser Ser Leu Val Phe Thr Ala Ile Thr Trp Ile Ser Phe
305 310 315 320

Ser Ser Ser Ala Ser Lys Pro Thr Leu Tyr Ser Ile Tyr Asn Ala Asn
325 330 335

Phe Arg Arg Gly Met Lys Glu Thr Phe Cys Met Ser Ser Met Lys Cys
340 345 350

Tyr Arg Ser Asn Ala Tyr Thr Ile Thr Thr Ser Ser Arg Met Ala Lys
355 360 365

Lys Asn Tyr Val Gly Ile Ser Glu Ile Pro Ser Met Ala Lys Thr Ile
370 375 380

Thr Lys Asp Ser Ile Tyr Asp Ser Phe Asp Arg Glu Ala Lys Glu Lys
385 390 395 400

Lys Leu Ala Trp Pro Ile Asn Ser Asn Pro Pro Asn Thr Phe Val
405 410 415



(98) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
GGAAAGCTTA ACGATCCCCA GGAGCAACAT 
(99) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
CTGGGATCCT ACGAGAGCAT TTTTCACACA G 
(100) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180 

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240 

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360 

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420 

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480 

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540 

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600 

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660 

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATTTTCT AACCATGTTT 720 

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780

GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840 

TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900 

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCCCT 960 

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020 

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140 

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200 

TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260 

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320 

CGTGCCTCTG TCCATTTCAA GGGTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380

AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAGTG CTGCCACCAG CCACCCTAAA 1500 

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTG CTGACTATCC CAAGCCTGCC 1560

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GACAACCCTG AGCTCTCTGC CTCCCATTGC 1620 

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCC TGAGTCGGCC 1680 

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTGGAGTC TGACACCATC 1740

GCTGACCTTC CTGACCCTAC TGTAGTCACT ACCAGTACCA ATGATTACCA TGATGTCGTG 1800 

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842 
(101) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Phe Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Pro
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Gly
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Ser Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr
500 505 510

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala
515 520 525

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro
530 535 540

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala
545 550 555 560

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu
565 570 575

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser
580 585 590

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro
595 600 605

Asp Glu Met Ala Val
610

(102) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCCAAGCTTC GCCATGGGAC ATAACGGGAG CT 
(103) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 
CGTGAATTCC AAGAATTTAC AATCCTTGCT 30
(104) INFORMATION FOR SEQ ID NO: 103: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1548 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:






ATGGGACATA ACGGGAGCTG GATCTCTCCA AATGCCAGCG AGCCGCACAA CGCGTCCGGC 60
GCCGAGGCTG CGGGTGTGAA CCGCAGCGCG CTCGGGGAGT TCGGCGAGGC GCAGCTGTAC 120
CGCCAGTTCA CCACCACCGT GCAGGTCGTC ATCTTCATAG GCTCGCTGCT CGGAAACTTC 180
ATGGTGTTAT GGTCAACTTG CCGCACAACC GTGTTCAAAT CTGTCACCAA CAGGTTCATT 240
AAAAACCTGG CCTGCTCGGG GATTTGTGCC AGCCTGGTCT GTGTGCCCTT CGACATCATC 300
CTCAGCACCA GTCCTCACTG TTGCTGGTGG ATCTACACCA TGCTCTTCTG CAAGGTCGTC 360
AAATTTTTGC ACAAAGTATT CTGCTCTGTG ACCATCCTCA GCTTCCCTGC TATTGCTTTG 420
GACAGGTACT ACTCAGTCCT CTATCCACTG GAGAGGAAAA TATCTGATGC CAAGTCCCGT 480
GAACTGGTGA TGTACATCTG GGCCCATGCA GTGGTGGCCA GTGTCCCTGT GTTTGCAGTA 540
ACCAATGTGG CTGACATCTA TGCCACGTCC ACCTGCACGG AAGTCTGGAG CAACTCCTTG 600
GGCCACCTGG TGTACGTTCT GGTGTATAAC ATCACCACGG TCATTGTGCC TGTGGTGGTG 660
GTGTTCCTCT TCTTGATACT GATCCGACGG GCCCTGAGTG CCAGCCAGAA GAAGAAGGTC 720
ATCATAGCAG CGCTCCGGAC CCCACAGAAC ACCATCTCTA TTCCCTATGC CTCCCAGCGG 780
GAGGCCGAGC TGCACGCCAC CCTGCTCTCC ATGGTGATGG TCTTCATCTT GTGTAGCGTG 840
CCCTATGCCA CCCTGGTCGT CTACCAGACT GTGCTCAATG TCCCTGACAC TTCCGTCTTC 900
TTGCTGCTCA CTGCTGTTTG GCTGCCCAAA GTCTCCCTGC TGGCAAACCC TGTTCTCTTT 960
CTTACTGTGA ACAAATCTGT CCGCAAGTGC TTGATAGGGA CCCTGGTGCA ACTACACCAC 1020
CGGTACAGTC GCCGTAATGT GGTCAGTACA GGGAGTGGCA TGGCTGAGGC CAGCCTGGAA 1080
CCCAGCATAC GCTCCCGTAC CCAGCTCCTG GAGATO^CC ACATOX^^CA GCAGCAGATC 1140
TTTAAGCCCA CAGAGGATGA GGAAGAGAGT GAGGCCAAGT ACAI^GCTC AGCTGACTTC 1200
CAGGCCAAGG AGATATTTAG CACCTGCCTC GAGGGAGAGC AGGGGCC^CA GT^^CGCCC 1260
TCTGCCCCAC CCCTGAGCAC AGTGGACTCT GTATCCCAGG ^CACCGGC AGCCCCT^.^ 1320
GAACCTGAAA CATTCCCTGA TAAGTATTCC CI^CAGTrTG GC^GGCC rTTTGAGT^G 1380
CCTCCTCAGT GGCTCTCAGA GACCCGAAAC AGCAAGAAGC GGCTGCTCC CCCC.TGGGC 1440
AACACCCCAG AAGAGCTGAT CCAGACAAAG GTGCCCAAGG TAGGCAGGGT GGAGCGGAAG 1500
ATGAGCAGAA ACAATAAAGT GAGCATTTTT CCAAAGGTGG ATTCCTAG 1548

(105) INFORMATION FOR SEQ ID NO:104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Gly His Asn Gly Ser Trp Ile Ser Pro Asn Ala Ser Glu Pro His
1 5 10 15

Asn Ala Ser Gly Ala Glu Ala Ala Gly Val Asn Arg Ser Ala Leu Gly
20 25 30

Glu Phe Gly Glu Ala Gln Leu Tyr Arg Gln Phe Thr Thr Thr Val Gln
35 40 45

Val Val Ile Phe Ile Gly Ser Leu Leu Gly Asn Phe Met Val Leu Trp
50 55 60

Ser Thr Cys Arg Thr Thr Val Phe Lys Ser Val Thr Asn Arg Phe Ile
65 70 75 80

Lys Asn Leu Ala Cys Ser Gly Ile Cys Ala Ser Leu Val Cys Val Pro
85 90 95

Phe Asp Ile Ile Leu Ser Thr Ser Pro His Cys Cys Trp Trp Ile Tyr
100 105 110

Thr Met Leu Phe Cys Lys Val Val Lys Phe Leu His Lys Val Phe Cys
115 120 125

Ser Val Thr Ile Leu Ser Phe Pro Ala Ile Ala Leu Asp Arg Tyr Tyr
130 135 140

Ser Val Leu Tyr Pro Leu Glu Arg Lys Ile Ser Asp Ala Lys Ser Arg
145 150 155 160

Glu Leu Val Met Tyr Ile Trp Ala His Ala Val Val Ala Ser Val Pro
165 170 175

Val Phe Ala Val Thr Asn Val Ala Asp Ile Tyr Ala Thr Ser Thr Cys
180 185 190

Thr Glu Val Trp Ser Asn Ser Leu Gly His Leu Val Tyr Val Leu Val
195 200 205

Tyr Asn Ile Thr Thr Val Ile Val Pro Val Val Val Val Phe Leu Phe
210 215 220

Leu Ile Leu Ile Arg Arg Ala Leu Ser Ala Ser Gln Lys Lys Lys Val
225 230 235 240

Ile Ile Ala Ala Leu Arg Thr Pro Gln Asn Thr Ile Ser Ile Pro Tyr
245 250 255

Ala Ser Gln Arg Glu Ala Glu Leu His Ala Thr Leu Leu Ser Met Val
260 265 270

Met Val Phe Ile Leu Cys Ser Val Pro Tyr Ala Thr Leu Val Val Tyr
275 280 285

Gln Thr Val Leu Asn Val Pro Asp Thr Ser Val Phe Leu Leu Leu Thr
290 295 300

Ala Val Trp Leu Pro Lys Val Ser Leu Leu Ala Asn Pro Val Leu Phe
305 310 315 320

Leu Thr Val Asn Lys Ser Val Arg Lys Cys Leu Ile Gly Thr Leu Val
325 330 335

Gln Leu His His Arg Tyr Ser Arg Arg Asn Val Val Ser Thr Gly Ser
340 345 350

Gly Met Ala Glu Ala Ser Leu Glu Pro Ser Ile Arg Ser Gly Ser Gln
355 360 365

Leu Leu Glu Met Phe His Ile Gly Gln Gln Gln Ile Phe Lys Pro Thr
370 375 380

Glu Asp Glu Glu Glu Ser Glu Ala Lys Tyr Ile Gly Ser Ala Asp Phe
385 390 395 400

Gln Ala Lys Glu Ile Phe Ser Thr Cys Leu Glu Gly Glu Gln Gly Pro
405 410 415

Gln Phe Ala Pro Ser Ala Pro Pro Leu Ser Thr Val Asp Ser Val Ser
420 425 430

Gln Val Ala Pro Ala Ala Pro Val Glu Pro Glu Thr Phe Pro Asp Lys
435 440 445

Tyr Ser Leu Gln Phe Gly Phe Gly Pro Phe Glu Leu Pro Pro Gln Trp
450 455 460

Leu Ser Glu Thr Arg Asn Ser Lys Lys Arg Leu Leu Pro Pro Leu Gly
465 470 475 480

Asn Thr Pro Glu Glu Leu Ile Gln Thr Lys Val Pro Lys Val Gly Arg
485 490 495

Val Glu Arg Lys Met Ser Arg Asn Asn Lys Val Ser Ile Phe Pro Lys
500 505 510

Val Asp Ser
515

(106) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
GGAGAATTCA CTAGGCGAGG CGCTCCATC 29

(107) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

GGAGGATCCA GGAAACCTTA GGCCGAGTCC 30
(108) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

ATGAATCGGC ACCATCTGCA GGATCACTTT CTGGAAATAG ACAAGAAGAA CTGCTGTGTG 60 

TTCCGAGATG ACTTCATTGC CAAGGTGTTG CCGCCGGTGT TGGGGCTGGA GTTTATCTTT 120 

GGGCTTCTGG GCAATGGCCT TGCCCTGTGG ATTTTCTGTT TCCACCTCAA GTCCTGGAAA 180 

TCCAGCCGGA TTTTCCTGTT CAACCTGGCA GTAGCTGACT TTCTACTGAT CATCTGCCTG 240

CCGTTCGTGA TGGACTACTA TGTGCGGCGT TCAGACTGGA ACTTTGGGGA CATCCCTTGC 300 

CGGCTGGTGC TCTTCATGTT TGCCATGAAC CGCCAGGGCA GCATCATCTT CCTCACGGTG 360 

GTGGCGGTAG ACAGGTATTT CCGGGTGGTC CATCCCCACC ACGCCCTGAA CAAGATCTCC 420

AATTGGACAG CAGCCATCAT CTCTTGCCTT CTGTGGGGCA TCACTGTTGG CCTAACAGTC 480 

CACCTCCTGA AGAAGAAGTT GCTGATCCAG AATGGCCCTG CAAATGTGTG CATCAGCTTC 540

AGCATCTGCC ATACCTTCCG GTGGCACGAA GCTATGTTCC TCCTGGAGTT CCTCCTGCCC 600 

CTGGGCATCA TCCTGTTCTG CTCAGCCAGA ATTATCTGGA GCCTGCGGCA GAGACAAATG 660 

GACCGGCATG CCAAGATCAA GAGAGCCATC ACCTTCATCA TGGTGGTGGC CATCGTCTTT 720 

GTCATCTGCT TCCTTCCCAG CGTGGTTGTG CGGATCCGCA TCTTCTGGCT CCTGCACACT 780 

TCGGGCACGC AGAATTGTGA AGTGTACCGC TCGGTGGACC TGGCGTTCTT TATCACTCTC 840

AGCTTCACCT ACATGAACAG CATGCTGGAC CCCGTGGTGT ACTACTTCTC CAGCCCATCC 900 

TTTCCCAACT TCTTCTCCAC TTTGATCAAC CGCTGCCTCC AGAGGAAGAT GACAGGTGAG 960 

CCAGATAATA ACCGCAGCAC GAGCGTCGAG CTCACAGGGG ACCCCAACAA AACCAGAGGC 1020 

GCTCCAGAGG CGTTAATGGC CAACTCCGGT GAGCCATGGA GCCCCTCTTA TCTGGGCCCA 1080 

ACCTCAAATA ACCATTCCAA GAAGGGACAT TGTCACCAAG AACCAGCATC TCTGGAGAAA 1140

CAGTTGGGCT GTTGCATCGA GTAA 1164
(109) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:

Met Asn Arg His His Leu Gln Asp His Phe Leu Glu Ile Asp Lys Lys
1 5 10 15

Asn Cys Cys Val Phe Arg Asp Asp Phe Ile Ala Lys Val Leu Pro Pro
20 25 30

Val Leu Gly Leu Glu Phe Ile Phe Gly Leu Leu Gly Asn Gly Leu Ala
35 40 45

Leu Trp Ile Phe Cys Phe His Leu Lys Ser Trp Lys Ser Ser Arg Ile
50 55 60

Phe Leu Phe Asn Leu Ala Val Ala Asp Phe Leu Leu Ile Ile Cys Leu
65 70 75 80

Pro Phe Val Met Asp Tyr Tyr Val Arg Arg Ser Asp Trp Asn Phe Gly
85 90 95

Asp Ile Pro Cys Arg Leu Val Leu Phe Met Phe Ala Met Asn Arg Gln
100 105 110

Gly Ser Ile Ile Phe Leu Thr Val Val Ala Val Asp Arg Tyr Phe Arg
115 120 125

Val Val His Pro His His Ala Leu Asn Lys Ile Ser Asn Trp Thr Ala
130 135 140

Ala Ile Ile Ser Cys Leu Leu Trp Gly Ile Thr Val Gly Leu Thr Val
145 150 155 160

His Leu Leu Lys Lys Lys Leu Leu Ile Gln Asn Gly Pro Ala Asn Val
165 170 175

Cys Ile Ser Phe Ser Ile Cys His Thr Phe Arg Trp His Glu Ala Met
180 185 190

Phe Leu Leu Glu Phe Leu Leu Pro Leu Gly Ile Ile Leu Phe Cys Ser
195 200 205

Ala Arg Ile Ile Trp Ser Leu Arg Gln Arg Gln Met Asp Arg His Ala
210 215 220

Lys Ile Lys Arg Ala Ile Thr Phe Ile Met Val Val Ala Ile Val Phe
225 230 235 240

Val Ile Cys Phe Leu Pro Ser Val Val Val Arg Ile Arg Ile Phe Trp
245 250 255

Leu Leu His Thr Ser Gly Thr Gln Asn Cys Glu Val Tyr Arg Ser Val
260 265 270

Asp Leu Ala Phe Phe Ile Thr Leu Ser Phe Thr Tyr Met Asn Ser Met
275 280 285

Leu Asp Pro Val Val Tyr Tyr Phe Ser Ser Pro Ser Phe Pro Asn Phe
290 295 300

Phe Ser Thr Leu Ile Asn Arg Cys Leu Gln Arg Lys Met Thr Gly Glu
305 310 315 320

Pro Asp Asn Asn Arg Ser Thr Ser Val Glu Leu Thr Gly Asp Pro Asn
325 330 335

Lys Thr Arg Gly Ala Pro Glu Ala Leu Met Ala Asn Ser Gly Glu Pro
340 345 350

Trp Ser Pro Ser Tyr Leu Gly Pro Thr Ser Asn Asn His Ser Lys Lys
355 360 365

Gly His Cys His Gln Glu Pro Ala Ser Leu Glu Lys Gln Leu Gly Cys
370 375 380

Cys Ile Glu
385

(110) INFORMATION FOR SEQ ID NO:109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
ACCATGGCTT GCAATGGCAG TGCGGCCAGG GGGCACT

(111) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 
CGACCAGGAC AAACAGCATC TTGGTCACTT GTCTCCGGC 
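
The ANTI-SENSE: YES flag on this record means the primer is written as the strand complementary to the coding sequence. A minimal Python sketch, not part of the original listing, shows how its sense-strand reading can be recovered by reverse-complementing the printed bases. The function name `reverse_complement` and the constant `SEQ_110` are illustrative assumptions.

```python
# Sketch: recover the sense-strand reading of an anti-sense primer.
# COMPLEMENT maps each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# SEQ ID NO:110 exactly as printed above (spaces separate blocks of ten bases).
SEQ_110 = "CGACCAGGAC AAACAGCATC TTGGTCACTT GTCTCCGGC"

sense_reading = reverse_complement(SEQ_110)
print(len(sense_reading), sense_reading)
```

Comparing the result against the full-length coding sequence the primer was designed for is a quick way to confirm where the primer anneals.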

(112) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
GACCAAGATG CTGTTTGTCC TGGTCGTGGT GTTTGGCAT
(113) INFORMATION FOR SEQ ID NO:112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 
CGGAATTCAG GATGGATCGG TCTCTTGCTG CGCCT 
(114) INFORMATION FOR SEQ ID NO: 113: 



PCT/US99/23938 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1212 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

ATGGCTTGCA ATGGCAGTGC GGCCAGGGGG CACTTTGACC CTGAGGACTT GAACCTGACT 60
GACGAGGCAC TGAGACTCAA GTACCTGGGG CCCCAGCAGA CAGAGCTGTT CATGCCCATC 120
TGTGCCACAT ACCTGCTGAT CTTCGTGGTG GGCGCTGTGG GCAATGGGCT GACCTGTCTG 180
GTCATCCTGC GCCACAAGGC CATGCGCACG CCTACCAACT ACTACCTCTT CAGCCTGGCC 240
GTGTCGGACC TGCTGGTGCT GCTGGTGGGC CTGCCCCTGG AGCTCTATGA GATGTGGCAC 300
AACTACCCCT TCCTGCTGGG CGTTGGTGGC TGCTATTTCC GCACGCTACT GTTTGAGATG 360
GTCTGCCTGG CCTCAGTGCT CAACGTGACT GCCCTCAGCG T^AACGCTA TGTGGCCGTG 420
GTGCACCCAC TCCAGGCCAG GTCCATGGTG ACGCGGGCCC ATGTGCGCCG AGTCCTTGGG 480
GCCGTCTGGG GTCTTGCCAT GCTCTGCTCC CTGCCCAACA CCAGCCTGCA CGGCATCCGG 540
CAGCTGCACG TGCCCTGCCG GGGCCCAGTG CCAGACTCAG CTGTTTGCAT GCTGGTCCGC 600
CCACGGGCCC TCTACAACAT GGTAGTGCAG ACCACCGCGC TGCTCTTCTT CTGCCTGCCC 660
ATGGCCATCA TGAGCGTGCT CTACCTGCTC ATTGGGCTGC GACTGCGGCG GGAGAGGCTG 720
CTGCTCATGC AGGAGGCCAA GGGCAGGGGC TCTGCAGCAG CCAGGTCCAG ATACACCTGC 780
AGGCTCCAGC AGCACGATCG GGGCCGGAGA CAAGTGACCA AGATGCTGTT TGTCCTGGTC 840
GTGGTGTTTG GCATCTGCTG GGCCCCGTTC CACGCCGACC GCGTCATGTG GAGCGTCGTG 900
TCACAGTGGA CAGATGGCCT GCACCTGGCC TTCCAGCACG TGCACGTCAT CTCCGGCATC 960
TTCTTCTACC TGGGCTCGGC GGCCAACCCC GTGCTCTATA GCCTCATGTC CAGCCGCTTC 1020
CGAGAGACCT TCCAGGAGGC CCTGTGCCTC GGGGCCTGCT GCCATCGCCT CAGACCCCGC 1080
CACAGCTCCC ACAGCCTCAG CAGGATGACC ACAGGCAGCA GCCTGTGTGA TGTGGGCTCC 1140
CTGGGCAGCT GGGTCCACCC CCTGGCTGGG AACGATGGCC CAGAGGCGCA GCAAGAGACC 1200
GATCCATCCT GA 1212


(115) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 403 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

Met Ala Cys Asn Gly Ser Ala Ala Arg Gly His Phe Asp Pro Glu Asp
1 5 10 15

Leu Asn Leu Thr Asp Glu Ala Leu Arg Leu Lys Tyr Leu Gly Pro Gln
20 25 30

Gln Thr Glu Leu Phe Met Pro Ile Cys Ala Thr Tyr Leu Leu Ile Phe
35 40 45

Val Val Gly Ala Val Gly Asn Gly Leu Thr Cys Leu Val Ile Leu Arg
50 55 60

His Lys Ala Met Arg Thr Pro Thr Asn Tyr Tyr Leu Phe Ser Leu Ala
65 70 75 80

Val Ser Asp Leu Leu Val Leu Leu Val Gly Leu Pro Leu Glu Leu Tyr
85 90 95

Glu Met Trp His Asn Tyr Pro Phe Leu Leu Gly Val Gly Gly Cys Tyr
100 105 110

Phe Arg Thr Leu Leu Phe Glu Met Val Cys Leu Ala Ser Val Leu Asn
115 120 125

Val Thr Ala Leu Ser Val Glu Arg Tyr Val Ala Val Val His Pro Leu
130 135 140

Gln Ala Arg Ser Met Val Thr Arg Ala His Val Arg Arg Val Leu Gly
145 150 155 160

Ala Val Trp Gly Leu Ala Met Leu Cys Ser Leu Pro Asn Thr Ser Leu
165 170 175

His Gly Ile Arg Gln Leu His Val Pro Cys Arg Gly Pro Val Pro Asp
180 185 190

Ser Ala Val Cys Met Leu Val Arg Pro Arg Ala Leu Tyr Asn Met Val
195 200 205

Val Gln Thr Thr Ala Leu Leu Phe Phe Cys Leu Pro Met Ala Ile Met
210 215 220

Ser Val Leu Tyr Leu Leu Ile Gly Leu Arg Leu Arg Arg Glu Arg Leu
225 230 235 240

Leu Leu Met Gln Glu Ala Lys Gly Arg Gly Ser Ala Ala Ala Arg Ser
245 250 255

Arg Tyr Thr Cys Arg Leu Gln Gln His Asp Arg Gly Arg Arg Gln Val
260 265 270

Thr Lys Met Leu Phe Val Leu Val Val Val Phe Gly Ile Cys Trp Ala
275 280 285

Pro Phe His Ala Asp Arg Val Met Trp Ser Val Val Ser Gln Trp Thr
290 295 300

Asp Gly Leu His Leu Ala Phe Gln His Val His Val Ile Ser Gly Ile
305 310 315 320

Phe Phe Tyr Leu Gly Ser Ala Ala Asn Pro Val Leu Tyr Ser Leu Met
325 330 335

Ser Ser Arg Phe Arg Glu Thr Phe Gln Glu Ala Leu Cys Leu Gly Ala
340 345 350

Cys Cys His Arg Leu Arg Pro Arg His Ser Ser His Ser Leu Ser Arg
355 360 365

Met Thr Thr Gly Ser Thr Leu Cys Asp Val Gly Ser Leu Gly Ser Trp
370 375 380

Val His Pro Leu Ala Gly Asn Asp Gly Pro Glu Ala Gln Gln Glu Thr
385 390 395 400

Asp Pro Ser

(116) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
GGAAGCTTCA GGCCCAAAGA TGGGGAACAT 30

(117) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

GTGGATCCAC CCGCGGAGGA CCCAGGCTAG 30 
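
Each record in this listing declares a LENGTH and then prints the bases in blocks of ten, followed by a running count. A small Python sketch, offered only as an illustration and not drawn from the original document, checks a printed sequence line against its declared length. The helper name `matches_declared_length` is an assumption.

```python
# Sketch: verify that a printed sequence line matches its declared LENGTH.
def matches_declared_length(sequence_line: str, declared_length: int) -> bool:
    """Count the bases on a listing line, skipping the trailing numeric count."""
    tokens = sequence_line.split()
    # Keep only alphabetic tokens (base blocks); drop counts such as "30".
    bases = "".join(token for token in tokens if token.isalpha())
    return len(bases) == declared_length


# SEQ ID NO:116 as printed above, declared as 30 base pairs.
line_116 = "GTGGATCCAC CCGCGGAGGA CCCAGGCTAG 30"
print(matches_declared_length(line_116, 30))
```

A mismatch usually indicates a dropped or duplicated character on the line, so this check is a cheap first pass before any primer is ordered from a transcribed listing.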

(118) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1098 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

ATGGGGAACA TCACTGCAGA CAACTCCTCG ATGAGCTGTA CCATCGACCA TACCATCCAC 60
CAGACGCTGG CCCCGGTGGT CTATGTTACC GTGCTGGTGG TGGGCTTCCC GGCCAACTGC 120
CTGTCCCTCT ACTTCGGCTA CCTGCAGATC AAGGCCCGGA ACGAGCTGGG CGTGTACCTG 180
TGCAACCTGA CGGTGGCCGA CCTCTTCTAC ATCTGCTCGC TGCCCTTCTG GCTGCAGTAC 240
GTGCTGCAGC ACGACAACTG GTCTCACGGC GACCTGTCCT GCCAGGTGTG CGGCATCCTC 300
CTGTACGAGA ACATCTACAT CAGCGTGGGC TTCCTCTGCT GCATCTCCGT GGACCGCTAC 360
CTCGCTGTGG CCCATCCCTT CCGCTTCCAC CAGTTCCGGA CCCTGAAGGC GGCCGTCGGC 420
GTCAGCGTGG TCATCTGGGC CAAGGAGCTG CTGACCAGCA TCTACTTCCT GATGCACGAG 480
GAGGTCATCG AGGACGAGAA CCAGCACCGC GTGTGCTTTG AGCACTACCC CATCCAGGCA 540
TGGCAGCGCG CCATCAACTA CTACCGCTTC CTGGTGGGCT TCCTCTTCCC CATCTCCCTC 600
CTGCTGGCGT CCTACCAGGG CATCCTGCGC GCCGTGCGCC GGAGCCACGG CACCCAGAAG 660
AGCCGCAAGG ACCAGATCCA GCGGCTGGTG CTCAGCACCG TGGTCATCTT CCTGGCCTGC 720
TTCCTGCCCT ACCACGTGTT GCTGCTGGTG CGCAGCGTCT GGGAGGCCAG CTGCGACTTC 780
GCCAAGGGCG TTTTCAACGC CTACCACTTC TCCCTCCTCC TCACCAGCTT CAACTCCGTC 840
GCCGACCCCG TGCTCTACTG CTTCGTCAGC GAGACCACCC ACCGGGACCT GGCCCGCCTC 900
CGCGGGGCCT GCCTGGCCTT CCTCACCTGC TCCAGGACCG GCCGGGCCAG GGAGGCCTAC 960
CCGCTGGGTG CCCCCGAGGC CTCCGGGAAA AGCGGGGCCC AGGGTGAGGA GCCCGAGCTG 1020
TTGACCAAGC TCCACCCGGC CTTCCAGACC CCTAACTCGC CAGGGTCGGG CGGGTTCCCC 1080
ACGGGCAGGT TGGCCTAG 1098

(119) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Gly Asn Ile Thr Ala Asp Asn Ser Ser Met Ser Cys Thr Ile Asp
1 5 10 15

His Thr Ile His Gln Thr Leu Ala Pro Val Val Tyr Val Thr Val Leu
20 25 30

Val Val Gly Phe Pro Ala Asn Cys Leu Ser Leu Tyr Phe Gly Tyr Leu
35 40 45

Gln Ile Lys Ala Arg Asn Glu Leu Gly Val Tyr Leu Cys Asn Leu Thr
50 55 60

Val Ala Asp Leu Phe Tyr Ile Cys Ser Leu Pro Phe Trp Leu Gln Tyr
65 70 75 80

Val Leu Gln His Asp Asn Trp Ser His Gly Asp Leu Ser Cys Gln Val
85 90 95

Cys Gly Ile Leu Leu Tyr Glu Asn Ile Tyr Ile Ser Val Gly Phe Leu
100 105 110

Cys Cys Ile Ser Val Asp Arg Tyr Leu Ala Val Ala His Pro Phe Arg
115 120 125

Phe His Gln Phe Arg Thr Leu Lys Ala Ala Val Gly Val Ser Val Val
130 135 140

Ile Trp Ala Lys Glu Leu Leu Thr Ser Ile Tyr Phe Leu Met His Glu
145 150 155 160

Glu Val Ile Glu Asp Glu Asn Gln His Arg Val Cys Phe Glu His Tyr
165 170 175

Pro Ile Gln Ala Trp Gln Arg Ala Ile Asn Tyr Tyr Arg Phe Leu Val
180 185 190

Gly Phe Leu Phe Pro Ile Cys Leu Leu Leu Ala Ser Tyr Gln Gly Ile
195 200 205

Leu Arg Ala Val Arg Arg Ser His Gly Thr Gln Lys Ser Arg Lys Asp
210 215 220

Gln Ile Gln Arg Leu Val Leu Ser Thr Val Val Ile Phe Leu Ala Cys
225 230 235 240

Phe Leu Pro Tyr His Val Leu Leu Leu Val Arg Ser Val Trp Glu Ala
245 250 255

Ser Cys Asp Phe Ala Lys Gly Val Phe Asn Ala Tyr His Phe Ser Leu
260 265 270

Leu Leu Thr Ser Phe Asn Cys Val Ala Asp Pro Val Leu Tyr Cys Phe
275 280 285

Val Ser Glu Thr Thr His Arg Asp Leu Ala Arg Leu Arg Gly Ala Cys
290 295 300

Leu Ala Phe Leu Thr Cys Ser Arg Thr Gly Arg Ala Arg Glu Ala Tyr
305 310 315 320

Pro Leu Gly Ala Pro Glu Ala Ser Gly Lys Ser Gly Ala Gln Gly Glu
325 330 335

Glu Pro Glu Leu Leu Thr Lys Leu His Pro Ala Phe Gln Thr Pro Asn
340 345 350

Ser Pro Gly Ser Gly Gly Phe Pro Thr Gly Arg Leu Ala
355 360 365

(120) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

GACCTCGAGT CCTTCTACAC CTCATC 
(121) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 
TGCTCTAGAT TCCAGATAGG TGAAAACTTG

(122) INFORMATION FOR SEQ ID NO:121:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1416 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
ATGGATATTC TTTGTGAAGA AAATACTTCT TTGAGCTCAA CTACGAACTC CCTAATGCAA 60
TTAAATGATG ACAACAGGCT CTACAGTAAT GACTTTAACT CCGGAGAAGC TAACACTTCT 120
GATGCATTTA ACTGGACAGT CGACTCTGAA AATCGAACCA ACCTTTCCTG TGAAGGGTGC 180
CTCTCACCGT CGTGTCTCTC CTTACTTCAT CTCCAGGAAA AAAACTGGTC TGCTTTACTG 240
ACAGCCGTAG TGATTATTCT AACTATTGCT GGAAACATAC TCGTCATCAT GGCAGTGTCC 300
CTAGAGAAAA AGCTGCAGAA TGCCACCAAC TATTTCCTGA TGTCACTTGC CATAGCTGAT 360
ATGCTGCTGG GTTTCCTTGT CATGCCCGTG TCCATGTTAA CCATCCTGTA TGGGTACCGG 420
TGGCCTCTGC CGAGCAAGCT TTGTGCAGTC TGGATTTACC TGGACGTGCT CTTCTCCACG 480
GCCTCCATCA TGCACCTCTG CGCCATCTCG CTGGACCGCT ACGTCGCCAT CCAGAATCCC 540
ATCCACCACA GCCGCTTCAA CTCCAGAACT AAGGCATTTC TGAAAATCAT TGCTGTTTGG 600
ACCATATCAG TAGGTATATC CATGCCAATA CCAGTCTTTG GGCTACAGGA CGATTCGAAG 660
GTCTTTAAGG AGGGGAGTTG CTTACTCGCC GATGATAACT TTGTCCTGAT CGGCTCTTTT 720
GTGTCATTTT TCATTCCCTT AACCATCATG GTGATCACCT ACTTTCTAAC TATCAAGTCA 780
CTCCAGAAAG AAGCTACTTT GTGTGTAAGT GATCTTGGCA CACGGGCCAA ATTAGCTTCT 840
TTCAGCTTCC TCCCTCAGAG TTCTTTGTCT TCAGAAAAGC TCTTCCAGCG GTCGATCCAT 900
AGGGAGCCAG GGTCCTACAC AGGCAGGAGG ACTATGCAGT CCATCAGCAA TGAGCAAAAG 960
GCATGCAAGG TGCTGGGCAT CGTCTTCTTC CTGTTTGTGG TGATGTGGTG CCCTTTCTTC 1020
ATCACAAACA TCATGGCCGT CATCTGCAAA GAGTCCTGCA ATGAGGATGT CATTGGGGCC 1080
CTGCTCAATG TGTTTGTTTG GATCGGTTAT CTCTCTTCAG CAGTCAACCC ACTAGTCTAC 1140
ACACTGTTCA ACAAGACCTA TAGGTCAGCC TTTTCACGGT ATATTCAGTG TCAGTACAAG 1200
GAAAACAAAA AACCATTGCA GTTAATTTTA GTGAACACAA TACCGGCTTT GGCCTACAAG 1260
TCTAGCCAAC TTCAAATGGG ACAAAAAAAG AATTCAAAGC AAGATGCCAA GACAACAGAT 1320
AATGACTGCT CAATGGTTGC TCTAGGAAAG CAGTATTCTG AAGAGGCTTC TAAAGACAAT 1380
AGCGACGGAG TGAATGAT^ GGTGAGCTGT GTGTGA 1416
(123) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 471 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Met Asp Ile Leu Cys Glu Glu Asn Thr Ser Leu Ser Ser Thr Thr Asn
1 5 10 15

Ser Leu Met Gln Leu Asn Asp Asp Asn Arg Leu Tyr Ser Asn Asp Phe
20 25 30

Asn Ser Gly Glu Ala Asn Thr Ser Asp Ala Phe Asn Trp Thr Val Asp
35 40 45

Ser Glu Asn Arg Thr Asn Leu Ser Cys Glu Gly Cys Leu Ser Pro Ser
50 55 60

Cys Leu Ser Leu Leu His Leu Gln Glu Lys Asn Trp Ser Ala Leu Leu
65 70 75 80

Thr Ala Val Val Ile Ile Leu Thr Ile Ala Gly Asn Ile Leu Val Ile
85 90 95

Met Ala Val Ser Leu Glu Lys Lys Leu Gln Asn Ala Thr Asn Tyr Phe
100 105 110

Leu Met Ser Leu Ala Ile Ala Asp Met Leu Leu Gly Phe Leu Val Met
115 120 125

Pro Val Ser Met Leu Thr Ile Leu Tyr Gly Tyr Arg Trp Pro Leu Pro
130 135 140

Ser Lys Leu Cys Ala Val Trp Ile Tyr Leu Asp Val Leu Phe Ser Thr
145 150 155 160

Ala Ser Ile Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala
165 170 175

Ile Gln Asn Pro Ile His His Ser Arg Phe Asn Ser Arg Thr Lys Ala
180 185 190

Phe Leu Lys Ile Ile Ala Val Trp Thr Ile Ser Val Gly Ile Ser Met
195 200 205

Pro Ile Pro Val Phe Gly Leu Gln Asp Asp Ser Lys Val Phe Lys Glu
210 215 220

Gly Ser Cys Leu Leu Ala Asp Asp Asn Phe Val Leu Ile Gly Ser Phe
225 230 235 240

Val Ser Phe Phe Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Phe Leu
245 250 255

Thr Ile Lys Ser Leu Gln Lys Glu Ala Thr Leu Cys Val Ser Asp Leu
260 265 270

Gly Thr Arg Ala Lys Leu Ala Ser Phe Ser Phe Leu Pro Gln Ser Ser
275 280 285

Leu Ser Ser Glu Lys Leu Phe Gln Arg Ser Ile His Arg Glu Pro Gly
290 295 300

Ser Tyr Thr Gly Arg Arg Thr Met Gln Ser Ile Ser Asn Glu Gln Lys
305 310 315 320

Ala Cys Lys Val Leu Gly Ile Val Phe Phe Leu Phe Val Val Met Trp
325 330 335

Cys Pro Phe Phe Ile Thr Asn Ile Met Ala Val Ile Cys Lys Glu Ser
340 345 350

Cys Asn Glu Asp Val Ile Gly Ala Leu Leu Asn Val Phe Val Trp Ile
355 360 365

Gly Tyr Leu Ser Ser Ala Val Asn Pro Leu Val Tyr Thr Leu Phe Asn
370 375 380

Lys Thr Tyr Arg Ser Ala Phe Ser Arg Tyr Ile Gln Cys Gln Tyr Lys
385 390 395 400

Glu Asn Lys Lys Pro Leu Gln Leu Ile Leu Val Asn Thr Ile Pro Ala
405 410 415

Leu Ala Tyr Lys Ser Ser Gln Leu Gln Met Gly Gln Lys Lys Asn Ser
420 425 430

Lys Gln Asp Ala Lys Thr Thr Asp Asn Asp Cys Ser Met Val Ala Leu
435 440 445

Gly Lys Gln Tyr Ser Glu Glu Ala Ser Lys Asp Asn Ser Asp Gly Val
450 455 460

Asn Glu Lys Val Ser Cys Val
465 470

(124) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
GACCTCGAGG TTGCTTAAGA CTGAAGC 27

(125) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 
ATTTCTAGAC ATATGTAGCT TGTACCG 27

(126) INFORMATION FOR SEQ ID NO:125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1377 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
ATGGTGAACC TGAGGAATGC GGTGCATTCA TTCCTTGTGC ACCTAATTGG CCTATTGGTT 60
TGGCAATGTG ATATTTCTGT GAGCCCAGTA GCAGCTATAG TAACTGACAT TTTCAATACC 120
TCCGATGGTC GACGCTTCAA ATTCCCAGAC GGGGTACAAA ACTGGCCAGC ACTTTCAATC 180
GTCATCATAA TAATCATGAC AATAGGTGGC AACATCCTTG TGATCATGGC AGTAAGCATG 240
GAAAAGAAAC TGCACAATGC CACCAATTAC TTCTTAATGT CCCTAGCCAT TGCTGATATG 300
CTAGTGGGAC TACTTGTCAT GCCCCTGTCT CTCCTGGCAA TCCTTTATGA TTATGTCTGG 360
CCACTACCTA GATATTTGTG CCCCGTCTGG ATTTCTTTAG ATGTTTTATT TTCAACAGCG 420
TCCATCATGC ACCTCTGCGC TATATCGCTG GATCGGTATG TAGCAATACG TAATCCTATT 480
GAGCATAGCC GTTTCAATTC GCGGACTAAG GCCATCATCA AGATTGCTAT TGTTTGGGCA 540
ATTTCTATAG GTGTATCAGT TCCTATCCCT GTGATTGGAC TGAGGGACGA AGAAAAGGTC 600
TTCGTGAACA ACACGACGTG CGTGCTCAAC GACCCAAATT TCGTTCTTAT TGGGTCCTTC 660
GTAGCTTTCT TCATACCGCT GACGATTATG GTGATTACGT ATTGCCTCAC CATCTACGTT 720
CTGCGCCGAC AAGCTTTGAT GTTACTGCAC GGCCACACCG AGGAACCGCC TGGACTAAGT 780
CTGGATTTCC TGAAGTGCTG CAAGAGGAAT ACGGCCGAGG AAGAGAACTC TGCAAACCCT 840
AACCAAGACC AGAACGCACG CCGAAGAAAG AAGAAGGAGA GACGTCCTAG GGGCACCATG 900
CAGGCTATCA ACAATGAAAG AAAAGCTTCG AAAGTCCTTG GGATTCTTTT CTTTGTCTTT 960
CTGATCATGT GGTGCCCATT TTTCATTACC AATATTCTGT CTGTTCTTTG TGAGAAGTCC 1020
TGTAACCAAA AGCTCATGGA AAAGCTTCTG AATGTGTTTG TTTGGATTGG CTATGTTTGT 1080
TCAGGAATCA ATCCTCTGGT GTATACTCTG TTCAACAAAA TTTACCGAAG GGCATTCTCC 1140
AACTATTTGC GTTGCAATTA TAAGGTAGAG AAAAAGCCTC CTGTCAGGCA GATTCCAAGA 1200
GTTGCCGCCA CTGCTTTGTC TGGGAGGGAG CTTAATGTTA ACATTTATCG GCATACCAAT 1260
GAACCGGTGA TCGAGAAAGC CAGTGACAAT GAGCCCGGTA TAGAGATCCA AGTTGAGAAT 1320
TTAGAGTTAC CAGTAAATCC CTCCAGTGTG GTTAGCGAAA GGATTAGCAG TGTGTGA 1377
(127) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:

Met Val Asn Leu Arg Asn Ala Val His Ser Phe Leu Val His Leu Ile
1 5 10 15

Gly Leu Leu Val Trp Gln Cys Asp Ile Ser Val Ser Pro Val Ala Ala
20 25 30

Ile Val Thr Asp Ile Phe Asn Thr Ser Asp Gly Gly Arg Phe Lys Phe
35 40 45

Pro Asp Gly Val Gln Asn Trp Pro Ala Leu Ser Ile Val Ile Ile Ile
50 55 60

Ile Met Thr Ile Gly Gly Asn Ile Leu Val Ile Met Ala Val Ser Met
65 70 75 80

Glu Lys Lys Leu His Asn Ala Thr Asn Tyr Phe Leu Met Ser Leu Ala
85 90 95

Ile Ala Asp Met Leu Val Gly Leu Leu Val Met Pro Leu Ser Leu Leu
100 105 110

Ala Ile Leu Tyr Asp Tyr Val Trp Pro Leu Pro Arg Tyr Leu Cys Pro
115 120 125

Val Trp Ile Ser Leu Asp Val Leu Phe Ser Thr Ala Ser Ile Met His
130 135 140

Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Arg Asn Pro Ile
145 150 155 160

Glu His Ser Arg Phe Asn Ser Arg Thr Lys Ala Ile Met Lys Ile Ala
165 170 175

Ile Val Trp Ala Ile Ser Ile Gly Val Ser Val Pro Ile Pro Val Ile
180 185 190

Gly Leu Arg Asp Glu Glu Lys Val Phe Val Asn Asn Thr Thr Cys Val
195 200 205

Leu Asn Asp Pro Asn Phe Val Leu Ile Gly Ser Phe Val Ala Phe Phe
210 215 220

Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Cys Leu Thr Ile Tyr Val
225 230 235 240

Leu Arg Arg Gln Ala Leu Met Leu Leu His Gly His Thr Glu Glu Pro
245 250 255

Pro Gly Leu Ser Leu Asp Phe Leu Lys Cys Cys Lys Arg Asn Thr Ala
260 265 270

Glu Glu Glu Asn Ser Ala Asn Pro Asn Gln Asp Gln Asn Ala Arg Arg
275 280 285

Arg Lys Lys Lys Glu Arg Arg Pro Arg Gly Thr Met Gln Ala Ile Asn
290 295 300

Asn Glu Arg Lys Ala Ser Lys Val Leu Gly Ile Val Phe Phe Val Phe
305 310 315 320

Leu Ile Met Trp Cys Pro Phe Phe Ile Thr Asn Ile Leu Ser Val Leu
325 330 335

Cys Glu Lys Ser Cys Asn Gln Lys Leu Met Glu Lys Leu Leu Asn Val
340 345 350

Phe Val Trp Ile Gly Tyr Val Cys Ser Gly Ile Asn Pro Leu Val Tyr
355 360 365

Thr Leu Phe Asn Lys Ile Tyr Arg Arg Ala Phe Ser Asn Tyr Leu Arg
370 375 380

Cys Asn Tyr Lys Val Glu Lys Lys Pro Pro Val Arg Gln Ile Pro Arg
385 390 395 400

Val Ala Ala Thr Ala Leu Ser Gly Arg Glu Leu Asn Val Asn Ile Tyr
405 410 415

Arg His Thr Asn Glu Pro Val Ile Glu Lys Ala Ser Asp Asn Glu Pro
420 425 430

Gly Ile Glu Met Gln Val Glu Asn Leu Glu Leu Pro Val Asn Pro Ser
435 440 445

Ser Val Val Ser Glu Arg Ile Ser Ser Val
450 455

(128) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 
GGTAAGCTTG GCAGTCCACG CCAGGCCTTC 30
(129) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 
TCCGAATTCT CTGTAGACAC AAGGCTTTGG 30 
(130) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

ATGGATCAGT TCCCTGAATC AGTGACAGAA AACTTTGAGT ACGATGATTT GGCTGAGGCC 60 

TGTTATATTG GGGACATCGT GGTCTTTGGG ACTGTGTTCC TGTCCATATT CTACTCCGTC 120 

ATCTTTGCCA TTGGCCTGGT GGGAAATTTG TTGGTAGTGT TTGCCCTCAC CAACAGCAAG 180 

AAGCCCAAGA GTGTCACCGA CATTTACCTC CTGAACCTGG CCTTGTCTGA TCTGCTGTTT 240 

GTAGCCACTT TGCCCTTCTG GACTCACTAT TTGATAAATG AAAAGGGCCT CCACAATGCC 300 

ATGTGCAAAT TCACTACCGC CTTCTTCTTC ATCGGCTTTT TTGGAAGCAT ATTCTTCATC 360 

ACCGTCATCA GCATTGATAG GTACCTGGCC ATCGTCCTGG CCGCCAACTC CATGAACAAC 420 

CGGACCGTGC AGCATGGCGT CACCATCAGC CTAGGCGTCT GGGCAGCAGC CATTTTGGTG 480 

GCAGCACCCC AGTTCATGTT CACAAAGCAG AAAGAAAATG AATGCCTTGG TGACTACCCC 540 

GAGGTCCTCC AGGAAATCTG GCCCGTGCTC CGCAATGTGG AAACAAATTT TCTTGGCTTC 600 

CTACTCCCCC TGCTCATTAT GAGTTATTGC TACTTCAGAA TCATCCAGAC GCTGTTTTCC 660 

TGCAAGAACC ACAAGAAAGC CAAAGCCATT AAACTGATCC TTCTGGTGGT CATCGTGTTT 720

TTCCTCTTCT GGACACCCTA CAACGTTATG ATTTTCCTGG AGACGCTTAA GCTCTATGAC 780 

TTCTTTCCCA GTTGTGACAT GAGGAAGGAT CTGAGGCTGG CCCTCAGTGT GACTGAGACG 840 

GTTGCATTTA GCCATTGTTG CCTGAATCCT CTCATCTATG CATTTGCTGG GGAGAAGTTC 900 

AGAAGATACC TTTACCACCT GTATGGGAAA TGCCTGGCTG TCCTGTGTGG GCGCTCAGTC 960

CACGTTGATT TCTCCTCATC TGAATCACAA AGGAGCAGGC ATGGAAGTGT TCTGAGCAGC 1020

AATTTTACTT ACCACACGAG TGATGGAGAT GCATTGCTCC TTCTCTGA 1068
(131) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 
Met Asp Gln Phe Pro Glu Ser Val Thr Glu Asn Phe Glu Tyr Asp Asp
1 5 10 15

Leu Ala Glu Ala Cys Tyr Ile Gly Asp Ile Val Val Phe Gly Thr Val
20 25 30

Phe Leu Ser Ile Phe Tyr Ser Val Ile Phe Ala Ile Gly Leu Val Gly
35 40 45

Asn Leu Leu Val Val Phe Ala Leu Thr Asn Ser Lys Lys Pro Lys Ser
50 55 60

Val Thr Asp Ile Tyr Leu Leu Asn Leu Ala Leu Ser Asp Leu Leu Phe
65 70 75 80

Val Ala Thr Leu Pro Phe Trp Thr His Tyr Leu Ile Asn Glu Lys Gly
85 90 95

Leu His Asn Ala Met Cys Lys Phe Thr Thr Ala Phe Phe Phe Ile Gly
100 105 110

Phe Phe Gly Ser Ile Phe Phe Ile Thr Val Ile Ser Ile Asp Arg Tyr
115 120 125

Leu Ala Ile Val Leu Ala Ala Asn Ser Met Asn Asn Arg Thr Val Gln
130 135 140

His Gly Val Thr Ile Ser Leu Gly Val Trp Ala Ala Ala Ile Leu Val
145 150 155 160

Ala Ala Pro Gln Phe Met Phe Thr Lys Gln Lys Glu Asn Glu Cys Leu
165 170 175

Gly Asp Tyr Pro Glu Val Leu Gln Glu Ile Trp Pro Val Leu Arg Asn
180 185 190

Val Glu Thr Asn Phe Leu Gly Phe Leu Leu Pro Leu Leu Ile Met Ser
195 200 205

Tyr Cys Tyr Phe Arg Ile Ile Gln Thr Leu Phe Ser Cys Lys Asn His
210 215 220

Lys Lys Ala Lys Ala Ile Lys Leu Ile Leu Leu Val Val Ile Val Phe
225 230 235 240

Phe Leu Phe Trp Thr Pro Tyr Asn Val Met Ile Phe Leu Glu Thr Leu
245 250 255

Lys Leu Tyr Asp Phe Phe Pro Ser Cys Asp Met Arg Lys Asp Leu Arg
260 265 270

Leu Ala Leu Ser Val Thr Glu Thr Val Ala Phe Ser His Cys Cys Leu
275 280 285

Asn Pro Leu Ile Tyr Ala Phe Ala Gly Glu Lys Phe Arg Arg Tyr Leu
290 295 300

Tyr His Leu Tyr Gly Lys Cys Leu Ala Val Leu Cys Gly Arg Ser Val
305 310 315 320

His Val Asp Phe Ser Ser Ser Glu Ser Gln Arg Ser Arg His Gly Ser
325 330 335

Val Leu Ser Ser Asn Phe Thr Tyr His Thr Ser Asp Gly Asp Ala Leu
340 345 350

Leu Leu Leu
355 
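
The protein of SEQ ID NO:130 is the conceptual translation of the coding sequence of SEQ ID NO:129. The following Python sketch is an illustration added for the reader and is not part of the original listing. It translates the first thirty bases of SEQ ID NO:129 with the standard genetic code and prints three-letter residue names in the style used here. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are assumptions.

```python
# Sketch: translate a DNA fragment into three-letter amino acid codes
# using the standard genetic code.
from itertools import product

BASES = "TCAG"
# One-letter residues in TCAG codon order (first base varies slowest).
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate DNA (spaces ignored) until a stop codon or the end of input."""
    cleaned = dna.replace(" ", "").upper()
    residues = []
    for i in range(0, len(cleaned) - 2, 3):
        aa = CODON_TABLE[cleaned[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# First three ten-base blocks of SEQ ID NO:129 as printed in the listing.
fragment_129 = "ATGGATCAGT TCCCTGAATC AGTGACAGAA"
print(translate(fragment_129))  # Met Asp Gln Phe Pro Glu Ser Val Thr Glu
```

Running the same function over a full coding sequence gives a residue-by-residue comparison against the printed protein, which helps locate transcription errors in either record.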

(132) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 
GATCTCCAGT AGGCATAAGT GGACAATTCT GG 32 

(133) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
CTCCTTCGGT CCTCCTATCG TTGTCAGAAG 30

(134) INFORMATION FOR SEQ ID NO:133: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
AGAAGGCCAA GATCGCGCGG CTGGCCCTCA 30

(135) INFORMATION FOR SEQ ID NO:134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

CGGCGCCACC GCACGAAAAA GCTCATCTTC 
(136) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 
GCCAAGAAGC GGGTGAAGTT CCTGGTGGTG GCA 
(137) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 
CAGGCGGAAG GTGAAAGTCC TGGTCCTCGT 

(138) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 
CGGCGCCTGC GGGCCAAGCG GCTGGTGGTG GTG 

(139) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 
CCAAGCACAA AGCCAAGAAA GTGACCATCA C 

(140) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 
GCGCCGGCGC ACCAAATGCT TGCTGGTGGT 

(141) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 
CAAAAAGCTG AAGAAATCTA AGAAGATCAT CTTTATTGTC G 
(142) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 
CAAGACCAAG GCAAAACGCA TGATCGCCAT 30 

(143) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
GTCAAGGAGA AGTCCAAAAG GATCATCATC 30 

(144) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143: 
CGCCGCGTGC GGGCCAAGCA GCTCCTGCTC 30 

(145) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 
CCTGATAAGC GCTATAAAAT GGTCCTGTTT CGA 

(146) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GAAAGA'CAAA AGAGAGTCAA GAGGATGTCT TTATTG 

(147) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
CGGAGT^GA GGGTGAAACG CACAGCCATC GCC 

(148) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
AAGCTTCAGC GGGCCAAGGC ACTGGTCACC 

(149) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
CAGCGGCAGA AGGCIAAAAG GGTGGCCATC 
(150) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 
CGGCAGAAGG CGAAGCGCAT GATCCTCGCG 

(151) INFORMATION FOR SEQ ID NO:150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 
GAGCGCAACA AGGCCAAAAA GGTGATCATC 

(152) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 
GGTGTAAACA AAAAGGCTAA AAACACAATT ATTCTTATT 

(153) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 
GAGAGCCAGC TCAAGAGCAC CGTGGTG 

(154) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 
CCACAAGCAA ACCAAGAAAA TGCTGGCTGT 

(155) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
CATCAAGTGT ATCATGTGCC AAGTACGCCC 

(156) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 
CTAGAGAGTC AGATGAAGTG TACAGTAGTG GCAC 

(157) INFORMATION FOR SEQ ID NO:156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 
CGGACAAAAG TGAAAACTAA AAAGATGTTC CTCATT 
(158) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

GCTGAGGTTC GCAATAAACT AACCATGTTT GTG 
(159) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

GGGAGGCCGA GCTGAAAGCC ACCCTGCTC 
(160) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 
CAAGATCAAG AGAGCCAAAA CCTTCATCAT G 
(161) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 
CCGGAGACAA GTGAAGAAGA TGCTGTTTGT C 31 

(162) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 
GCAAGGACCA GATCAAGCGG CTGGTGCTCA 30 

(163) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 
CAAGAAAGCC AAAGCCAAGA AACTGATCCT TCTG 34 

(164) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163: 

ATGGAAGATT TGGAGGAAAC ATTATTTGAA GAATTTGAAA ACTATTCCTA TGACCTAGAC 60 

TATTACTCTC TGGAGTCTGA TTTGGAGGAG AAAGTCCAGC TGGGAGTTGT TCACTGGGTC 120 

TCCCTGGTGT TATATTGTTT GGCTTTTGTT CTGGGAATTC CAGGAAATGC CATCGTCATT 180 

TGGTTCACGG GGCTCAAGTG GAAGAAGACA GTCACCACTC TGTGGTTCCT CAATCTAGCC 240 



ATTGCGGATT TCATTTTTCT TCTCTTTCTG CCCCTGTACA TCTCCTATGT GGCCATGAAT 300 
TTCCACTGGC CCTTTGGCAT CTGGCTGTGC AAAGCCAATT CCTTCACTGC CCAGTTGAAC 360 
ATGTTTGCCA GTCTTTTTTT CCTCACAGTG ATCAGCCTGG ACCACTATAT CCACTTGATC 420 
CATCCTGTCT TATCTCATCG GCATCGAACC CTCAAGAACT CTCTGATTGT CATTATATTC 480 
ATCTGGCTTT TGGCTTCTCT AATTGGCGGT CCTGCCCTGT ACTTCCGGGA CACTGTGGAG 540 
TTCAATAATC ATACTCTTTG CTATAACAAT TTTCAGAAGC ATCATCCTGA CCTCACI^G 600 
ATCAGGCACC ATGTTCTGAC TTGGGTGAAA TTTATCATTG GCTATCTCTT CCCTTTGCTA 660 
ACAATGAGTA TTTGCTACTT GTGTCTCATC TTCAAGGTGA AGAAGCGAAC AGTCCTGATC 720 
TCCAGTAGGC ATAAGTGGAC AATTCTGGTT GTGGTTGTGG CCTTTGTGGT TTGCTGGACT 780 
CCTTATCACC TGTTTAGCAT TTGGGAGCTC ACCA^TCACC ACAATAGCTA TTCCCACCAT 840 
GTGATGCAGG CTGGAATCCC CCTCTCCACT GGT^GCAT TCCTCAATAG TTGCTTGAAC 900 
CCCATCCTTT ATGTCCTAAT TAGTAAGAAG TTCCAAGCTC GCTTCCGGTC CTCAGTTGCT 960 
GAGATACTCA AGTACACACT GTGGGAAGTC AGCTGTTCTG GCACAGTGAG TGAACAGCTC 1020 
AGGAACTCAG AAACCAAGAA TCTGTGTCTC CTCGAAACAG CTCAATAA 1068 
(165) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Met Glu Asp Leu Glu Glu Thr Leu Phe Glu Glu Phe Glu Asn Tyr Ser 
1 5 10 15 

Tyr Asp Leu Asp Tyr Tyr Ser Leu Glu Ser Asp Leu Glu Glu Lys Val 
20 25 30 

Gln Leu Gly Val Val His Trp Val Ser Leu Val Leu Tyr Cys Leu Ala 
35 40 45 

Phe Val Leu Gly Ile Pro Gly Asn Ala Ile Val Ile Trp Phe Thr Gly 
50 55 60 

Leu Lys Trp Lys Lys Thr Val Thr Thr Leu Trp Phe Leu Asn Leu Ala 
65 70 75 80 

Ile Ala Asp Phe Ile Phe Leu Leu Phe Leu Pro Leu Tyr Ile Ser Tyr 
85 90 95 

Val Ala Met Asn Phe His Trp Pro Phe Gly Ile Trp Leu Cys Lys Ala 
100 105 110 

Asn Ser Phe Thr Ala Gln Leu Asn Met Phe Ala Ser Val Phe Phe Leu 
115 120 125 

Thr Val Ile Ser Leu Asp His Tyr Ile His Leu Ile His Pro Val Leu 
130 135 140 

Ser His Arg His Arg Thr Leu Lys Asn Ser Leu Ile Val Ile Ile Phe 
145 150 155 160 

Ile Trp Leu Leu Ala Ser Leu Ile Gly Gly Pro Ala Leu Tyr Phe Arg 
165 170 175 

Asp Thr Val Glu Phe Asn Asn His Thr Leu Cys Tyr Asn Asn Phe Gln 
180 185 190 

Lys His Asp Pro Asp Leu Thr Leu Ile Arg His His Val Leu Thr Trp 
195 200 205 

Val Lys Phe Ile Ile Gly Tyr Leu Phe Pro Leu Leu Thr Met Ser Ile 
210 215 220 

Cys Tyr Leu Cys Leu Ile Phe Lys Val Lys Lys Arg Thr Val Leu Ile 
225 230 235 240 

Ser Ser Arg His Lys Trp Thr Ile Leu Val Val Val Val Ala Phe Val 
245 250 255 

Val Cys Trp Thr Pro Tyr His Leu Phe Ser Ile Trp Glu Leu Thr Ile 
260 265 270 

His His Asn Ser Tyr Ser His His Val Met Gln Ala Gly Ile Pro Leu 
275 280 285 

Ser Thr Gly Leu Ala Phe Leu Asn Ser Cys Leu Asn Pro Ile Leu Tyr 
290 295 300 

Val Leu Ile Ser Lys Lys Phe Gln Ala Arg Phe Arg Ser Ser Val Ala 
305 310 315 320 

Glu Ile Leu Lys Tyr Thr Leu Trp Glu Val Ser Cys Ser Gly Thr Val 
325 330 335 

Ser Glu Gln Leu Arg Asn Ser Glu Thr Lys Asn Leu Cys Leu Leu Glu 
340 345 350 

Thr Ala Gln 
355 

(166) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1089 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

ATGGGCAACC ACACGTCGGA GGGCTGCCAC GTGGACTCGC GCGTGGACCA CCTCTTTCCG 60 
CCATCCCTCT ACATCT^GT CATCGGCGTG GGGCTGCCCA CCAACTGCCT GGCTCTGTGG 120 

GCGGCCTACC GCCAGGTGCA ACAGCGCAAC GAGCTGGGCG TCTACCTGAT GAACCTCAGC 180 

ATCGCCGACC TGCTGTACAT CTGCACGCTG CCGCTGTGGG TGGACTACTT CCTGCACCAC 240 

GACAACTGGA TCCACGGCCC CGGGTCCTGC AAGCTCTTTG GGTTCATCTT CTACACCAAT 300 

ATCTACATCA GCATCGCCTT CCTGTGCTGC ATCTCGGTGG ACCGCTACCT GGCTGTGGCC 360 

CACCCACTCC GCTTCGCCCG CCTGCGCCGC GTCAAGACCG CCGTGGCCGT GAGCTCCGTG 420 

GTCTGGGCCA CGGAGCTGGG CGCCAACTCG GCGCCCCTGT TCCATGACGA GCTCTTCCGA 480 

GACCGCTACA ACCACACCTT CTGCTTTGAG AAGTTCCCCA TGGAAGGCTG GGTGGCCTGG 540 

ATGAACCTCT ATCGGGTGTT CGTGGGCTTC CTCTTCCCGT GGGCGCTCAT GCTGCTGTCG 600 

TACCGGGGCA TCCTGCGGGC CGTGCGGGGC AGCGTGTCCA CCGAGCGCCA GGAGAAGGCC 660 

AAGATCGCGC GGCTGGCCCT CAGCCTCATC GCCATCGTCC TGGTCTGCTT -raCGCCCTAT 720 

CACGTGCTCT TGCTGTCCCG CAGCGCCATC TACCTGGGCC GCCCCTGGGA CTGCGGCTTC 780 

GAGGAGCGCG TCTTTTCTGC ATACCACAGC TCACTGGCTT TCACCAGCCT CAACTGTGTG 840 

GCGGACCCCA TCCTCTACTG CCTGGTCAAC GAGGGCGCCC GCAGCGATGT GGCCAAGGCC 900 

CTGCACAACC TGCTCCGCTT TCTGGCCAGC GACAAGCCCC AGGAGATGGC CAATGCCTCG 960 
CTCACCC^ AGACCCCACT CACCTCCAAG AGGAACAGCA CAGCCAAAGC CATGACTGGC 1020 
AGCTGGGCGG CCACTCCGCC TTCCCAGGGG GACCAGGTGC AGCTGAAGAT GCTGCCGCCA 1080 
GCACAATGA 1089 

(167) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 
(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

Met Gly Asn His Thr Trp Glu Gly Cys His Val Asp Ser Arg Val Asp 
1 5 10 15 

His Leu Phe Pro Pro Ser Leu Tyr Ile Phe Val Ile Gly Val Gly Leu 
20 25 30 

Pro Thr Asn Cys Leu Ala Leu Trp Ala Ala Tyr Arg Gln Val Gln Gln 
35 40 45 

Arg Asn Glu Leu Gly Val Tyr Leu Met Asn Leu Ser Ile Ala Asp Leu 
50 55 60 

Leu Tyr Ile Cys Thr Leu Pro Leu Trp Val Asp Tyr Phe Leu His His 
65 70 75 80 

Asp Asn Trp Ile His Gly Pro Gly Ser Cys Lys Leu Phe Gly Phe Ile 
85 90 95 

Phe Tyr Thr Asn Ile Tyr Ile Ser Ile Ala Phe Leu Cys Cys Ile Ser 
100 105 110 

Val Asp Arg Tyr Leu Ala Val Ala His Pro Leu Arg Phe Ala Arg Leu 
115 120 125 

Arg Arg Val Lys Thr Ala Val Ala Val Ser Ser Val Val Trp Ala Thr 
130 135 140 

Glu Leu Gly Ala Asn Ser Ala Pro Leu Phe His Asp Glu Leu Phe Arg 
145 150 155 160 

Asp Arg Tyr Asn His Thr Phe Cys Phe Glu Lys Phe Pro Met Glu Gly 
165 170 175 

Trp Val Ala Trp Met Asn Leu Tyr Arg Val Phe Val Gly Phe Leu Phe 
180 185 190 

Pro Trp Ala Leu Met Leu Leu Ser Tyr Arg Gly Ile Leu Arg Ala Val 
195 200 205 

Arg Gly Ser Val Ser Thr Glu Arg Gln Glu Lys Ala Lys Ile Ala Arg 
210 215 220 

Leu Ala Leu Ser Leu Ile Ala Ile Val Leu Val Cys Phe Ala Pro Tyr 
225 230 235 240 

His Val Leu Leu Leu Ser Arg Ser Ala Ile Tyr Leu Gly Arg Pro Trp 
245 250 255 

Asp Cys Gly Phe Glu Glu Arg Val Phe Ser Ala Tyr His Ser Ser Leu 
260 265 270 

Ala Phe Thr Ser Leu Asn Cys Val Ala Asp Pro Ile Leu Tyr Cys Leu 
275 280 285 

Val Asn Glu Gly Ala Arg Ser Asp Val Ala Lys Ala Leu His Asn Leu 
290 295 300 

Leu Arg Phe Leu Ala Ser Asp Lys Pro Gln Glu Met Ala Asn Ala Ser 
305 310 315 320 

Leu Thr Leu Glu Thr Pro Leu Thr Ser Lys Arg Asn Ser Thr Ala Lys 
325 330 335 

Ala Met Thr Gly Ser Trp Ala Ala Thr Pro Pro Ser Gln Gly Asp Gln 
340 345 350 

Val Gln Leu Lys Met Leu Pro Pro Ala Gln 
355 360 

(168) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 
ATGGAGTCCT CAGGCAACCC AGAGAGCACC ACCTTTTTTT ACTATGACCT TCAGAGCCAG 60 
CCGTGTGAGA ACCAGGCCTG GGTCTTTGCT ACCCTCGCCA CCACTGTCCT GTACTGCCTG 120 
GTGTTTCTCC TCAGCCTAGT GGGCAACAGC CTGGTCCTGT GGGTCCTGGT GAAGTATGAG 180 
AGCCTGGAGT CCCTCACCAA CATCTTCATC CTCAACCTGT GCCTCTCAGA CCTGGT^rrC 240 
GCCTGCTTGT TGCCTGTGTG GATCTCCCCA TACCACTGGG GCTGGGTGCT GGGAGACTTC 300 
CTCTGCAAAC TCCTCAATAT GATCTTCTCC ATCAGCCTCT ACAGOVGCAT CTTCTTCCT^ 360 
ACCATCATGA CCATCCACCG CTACCTGTCG GTAGTGAGCC CCCTCTCCAC CCTGCGCGTC 420 
CCCACCCTCC GCTGCCGGGT GCTGGTGACC ATGGCTGTGT GGGTAGCCAG CATCCTGTCC 480 
TCCATCCTCG ACACCATCTT CCACAAGGTG CTTTCTTCGG GCTGTCATTA TTCCGAACTC 540 
ACGTGGTACC TCACCTCCGT CTACCAGCAC AACCTCTTCT TCCTGCTGTC CCTGGGGATT 600 
ATCCTGTTCT GCTACGTGGA GATCCTCAGG ACCCTGTTCC GCTCACGCTC CAAGCGGCGC 660 
CACCGCACGA AAAAGCTCAT CTTCGCCATC GTCGTGGCCT ACTTCCTCAG CTGGGGTCCC 720 
TACAACTTCA CCCTGTTTCT GCAGACGCTG TTTCGGACCC AGATCATCCG GAGCTGCGAG 780 
GCCAAACAGC AGCTAGAATA CGCCCTGCTC ATCTGCCGCA ACCTCGCCTT CTCCCACTGC 840 

TGCTTTAACC CGGTGCTCTA TGTCTTCGTG GGGGTCAAGT TCCGCACACA CCTGAAACAT 900 

GTTCTCCGGC AGTTCTGGTT CTGCCGGCTG CAGGCACCCA GCCCAGCCTC GATCCCCCAC 960 

TCCCCTGGTG CCTTCGCCTA TGAGGGCGCC TCCTTCTACT GA 1002 

(169) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

Met Glu Ser Ser Gly Asn Pro Glu Ser Thr Thr Phe Phe Tyr Tyr Asp 
1 5 10 15 

Leu Gln Ser Gln Pro Cys Glu Asn Gln Ala Trp Val Phe Ala Thr Leu 
20 25 30 

Ala Thr Thr Val Leu Tyr Cys Leu Val Phe Leu Leu Ser Leu Val Gly 
35 40 45 

Asn Ser Leu Val Leu Trp Val Leu Val Lys Tyr Glu Ser Leu Glu Ser 
50 55 60 

Leu Thr Asn Ile Phe Ile Leu Asn Leu Cys Leu Ser Asp Leu Val Phe 
65 70 75 80 

Ala Cys Leu Leu Pro Val Trp Ile Ser Pro Tyr His Trp Gly Trp Val 
85 90 95 

Leu Gly Asp Phe Leu Cys Lys Leu Leu Asn Met Ile Phe Ser Ile Ser 
100 105 110 

Leu Tyr Ser Ser Ile Phe Phe Leu Thr Ile Met Thr Ile His Arg Tyr 
115 120 125 

Leu Ser Val Val Ser Pro Leu Ser Thr Leu Arg Val Pro Thr Leu Arg 
130 135 140 

Cys Arg Val Leu Val Thr Met Ala Val Trp Val Ala Ser Ile Leu Ser 
145 150 155 160 

Ser Ile Leu Asp Thr Ile Phe His Lys Val Leu Ser Ser Gly Cys Asp 
165 170 175 

Tyr Ser Glu Leu Thr Trp Tyr Leu Thr Ser Val Tyr Gln His Asn Leu 
180 185 190 

Phe Phe Leu Leu Ser Leu Gly Ile Ile Leu Phe Cys Tyr Val Glu Ile 
195 200 205 

Leu Arg Thr Leu Phe Arg Ser Arg Ser Lys Arg Arg His Arg Thr Lys 
210 215 220 

Lys Leu Ile Phe Ala Ile Val Val Ala Tyr Phe Leu Ser Trp Gly Pro 
225 230 235 240 

Tyr Asn Phe Thr Leu Phe Leu Gln Thr Leu Phe Arg Thr Gln Ile Ile 
245 250 255 

Arg Ser Cys Glu Ala Lys Gln Gln Leu Glu Tyr Ala Leu Leu Ile Cys 
260 265 270 

Arg Asn Leu Ala Phe Ser His Cys Cys Phe Asn Pro Val Leu Tyr Val 
275 280 285 

Phe Val Gly Val Lys Phe Arg Thr His Leu Lys His Val Leu Arg Gln 
290 295 300 

Phe Trp Phe Cys Arg Leu Gln Ala Pro Ser Pro Ala Ser Ile Pro His 
305 310 315 320 

Ser Pro Gly Ala Phe Ala Tyr Glu Gly Ala Ser Phe Tyr 
325 330 

(170) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 987 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

ATGGACAACG CCTCGTTCTC GGAGCCCTGG CCCGCCAACG CATCGGGCCC GGACCCGGCG 60 

CTGAGCTGCT CCAACGCGTC GACTCTGGCG CCGCTGCCGG CGCCGCTGGC GGTGGCTGTA 120 

CCAGTTGTCT ACGCGGTGAT CTGCGCCGTG GGTCTGGCGG GCAACTCCGC CGTGCTGTAC 180 

GTGTTGCTGC GGGCGCCCCG CATGAAGACC GTCACCAACC TGTTCATCCT CAACCTGGCC 240 

ATCGCCGACG AGCTCTTCAC GCTGGTGCTG CCCATCAACA TCGCCGACTT CCTGCTGCGG 300 

CAGTGGCCCT TCGGGGAGCT CATGTGCAAG CTCATCGTGG CTATCGACCA GTACAACACC 360 

TTCTCCAGCC TCTACTTCCT CACCGTCATG AGCGCCGACC GCTACCTGGT GGTGTTGGCC 420 

ACTGCGGAGT CGCGCCGGGT GGCCGGCCGC ACCTACAGCG CCGCGCGCGC GGTGAGCCTG 480 

GCCGTGTGGG GGATCGTCAC ACTCGTCGTG CTGCCCTTCG CAGTCTTCGC CCGGCTAGAC 540 

GACGAGCAGG GCCGGCGCCA GTGCGTGCTA GTCTTTCCGC AGCCCGAGGC CTTCTGGTGG 600 

CGCGCGAGCC GCCTCTACAC GCTCGTGCTG GGCTTCGCCA TCCCCGTGTC CACCATCTGT 660 

GTCCTCTATA CCACCCTGCT GTGCCGGCTG CATGCCATGC GGCTGGACAG CCACGCCAAG 720 

GCCCTGGAGC GCGCCAAGAA GCGGGTGAAG TTCCTGGTGG TGGCAATCCT GGCGGTGTGC 780 

CTCCTCTGCT GGACGCCCTA CCACCTGAGC ACCGTGGTGG CGCTCTICCAC CGACCTCCCG 840 

CAGACGCCGC TGGTCATCGC TATCTCCTAC TTCATCACCA GCCTGACGTA CGCCAACAGC 900 

TGCCTCAACC CCTTCCTCTA CGCCTTCCTG GACGCCAGCT TCCGCAGGAA CCTCCGCCAG 960 

CTGATAACTT GCCGCGCGGC AGCCTGA 987 
(171) INFORMATION FOR SEQ ID NO:170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 328 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Met Asp Asn Ala Ser Phe Ser Glu Pro Trp Pro Ala Asn Ala Ser Gly 
1 5 10 15 

Pro Asp Pro Ala Leu Ser Cys Ser Asn Ala Ser Thr Leu Ala Pro Leu 
20 25 30 

Pro Ala Pro Leu Ala Val Ala Val Pro Val Val Tyr Ala Val Ile Cys 
35 40 45 

Ala Val Gly Leu Ala Gly Asn Ser Ala Val Leu Tyr Val Leu Leu Arg 
50 55 60 

Ala Pro Arg Met Lys Thr Val Thr Asn Leu Phe Ile Leu Asn Leu Ala 
65 70 75 80 

Ile Ala Asp Glu Leu Phe Thr Leu Val Leu Pro Ile Asn Ile Ala Asp 
85 90 95 

Phe Leu Leu Arg Gln Trp Pro Phe Gly Glu Leu Met Cys Lys Leu Ile 
100 105 110 

Val Ala Ile Asp Gln Tyr Asn Thr Phe Ser Ser Leu Tyr Phe Leu Thr 
115 120 125 

Val Met Ser Ala Asp Arg Tyr Leu Val Val Leu Ala Thr Ala Glu Ser 
130 135 140 

Arg Arg Val Ala Gly Arg Thr Tyr Ser Ala Ala Arg Ala Val Ser Leu 
145 150 155 160 

Ala Val Trp Gly Ile Val Thr Leu Val Val Leu Pro Phe Ala Val Phe 
165 170 175 

Ala Arg Leu Asp Asp Glu Gln Gly Arg Arg Gln Cys Val Leu Val Phe 
180 185 190 

Pro Gln Pro Glu Ala Phe Trp Trp Arg Ala Ser Arg Leu Tyr Thr Leu 
195 200 205 

Val Leu Gly Phe Ala Ile Pro Val Ser Thr Ile Cys Val Leu Tyr Thr 
210 215 220 

Thr Leu Leu Cys Arg Leu His Ala Met Arg Leu Asp Ser His Ala Lys 
225 230 235 240 

Ala Leu Glu Arg Ala Lys Lys Arg Val Lys Phe Leu Val Val Ala Ile 
245 250 255 

Leu Ala Val Cys Leu Leu Cys Trp Thr Pro Tyr His Leu Ser Thr Val 
260 265 270 

Val Ala Leu Thr Thr Asp Leu Pro Gln Thr Pro Leu Val Ile Ala Ile 
275 280 285 

Ser Tyr Phe Ile Thr Ser Leu Thr Tyr Ala Asn Ser Cys Leu Asn Pro 
290 295 300 

Phe Leu Tyr Ala Phe Leu Asp Ala Ser Phe Arg Arg Asn Leu Arg Gln 
305 310 315 320 

Leu Ile Thr Cys Arg Ala Ala Ala 
325 

(172) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

ATGCAGGCCG CTGGGCACCC AGAGCCCCTT GACAGCAGGG GCTCCTTCTC CCTCCCCACG 60 
A^TGCCA ACGTCTCTCA GGACAATGGC ACTGGCCACA ATGCCACCTT CTCCGAGCCA 120 
CTGCCGTTCC TCTATGTGCT CCTGCCCGCC GTGTACTCCG GGATCTCTGC TGTGGGGCTG 180 
ACTGGCAACA CGGCCGTCAT CCTTGTAATC CTAAGGGCGC CCAAGATGAA GACGGTGACC 240 
AACGTGTTCA TCCTGAACCT GGCCGTCGCC GACGGGCTCT TCACGCTGGT ACTGCCTGTC 300 
AACATCGCGG AGCACCTGCT GCAGTACTGG CCCTTCGGGG AGCTGCTCTG CAAGCTGGTG 360 
CTGGCCGTCG ACCACTACAA CATCTTCTCC AGCATCTACT TCCTAGCCGT GATGAGCGTG 420 
GACCGATACC TGGTGGTGCT GGCCACCGTG AGGTCCCGCC ACATGCCCTG GCGCACCTAC 480 
CGGGGGGCGA AGGTCGCCAG CCTGTGTGTC TGGCTGGGCG TCACGGTCCT GGTTCTGCCC 540 
TTCTTCTCTT TCGCTGGCGT CTACAGCAAC GAGCTGCAGG TCCCAAGCTG TGGGCTGAGC 600 
TTCCCGTGGC CCGAGCAGGT CTGGTTCAAG GCCAGCCGTG TCTACACGTT GGTCCTGGGC 660 
TTCGTGCTGC CCGTGTGCAC CATCTGTGTG CTCTACACAG ACCTCCTGCG CAGGCTGCGG 720 
GCCGTGCGGC TCCGCTCTGG AGCCAAGGCT CTAGGCAAGG CCAGGCGGAA GGTGAAAGTC 780 
CTGGTCCTCG TCGTGCTGGC CGTGTGCCTC CTCTGCTGGA CGCCCTTCCA CCTGGCCTCT 840 
GTCGTGGCCC TGACCACGGA CCTGCCCCAG ACCCCACTGG TCATCAGTAT GTCCTACGTC 900 
ATCACCAGCC TCACGTACGC CAACTCGTGC CTGAACCCCT TCCTCTACGC CTTTCTAGAT 960 
GACAACTTCC GGAAGAACTT CCGCAGCATA TTGCGGTGCT GA 1002 


(173) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Met Gln Ala Ala Gly His Pro Glu Pro Leu Asp Ser Arg Gly Ser Phe 
1 5 10 15 

Ser Leu Pro Thr Met Gly Ala Asn Val Ser Gln Asp Asn Gly Thr Gly 
20 25 30 

His Asn Ala Thr Phe Ser Glu Pro Leu Pro Phe Leu Tyr Val Leu Leu 
35 40 45 

Pro Ala Val Tyr Ser Gly Ile Cys Ala Val Gly Leu Thr Gly Asn Thr 
50 55 60 

Ala Val Ile Leu Val Ile Leu Arg Ala Pro Lys Met Lys Thr Val Thr 
65 70 75 80 

Asn Val Phe Ile Leu Asn Leu Ala Val Ala Asp Gly Leu Phe Thr Leu 
85 90 95 

Val Leu Pro Val Asn Ile Ala Glu His Leu Leu Gln Tyr Trp Pro Phe 
100 105 110 

Gly Glu Leu Leu Cys Lys Leu Val Leu Ala Val Asp His Tyr Asn Ile 
115 120 125 

Phe Ser Ser Ile Tyr Phe Leu Ala Val Met Ser Val Asp Arg Tyr Leu 
130 135 140 

Val Val Leu Ala Thr Val Arg Ser Arg His Met Pro Trp Arg Thr Tyr 
145 150 155 160 

Arg Gly Ala Lys Val Ala Ser Leu Cys Val Trp Leu Gly Val Thr Val 
165 170 175 

Leu Val Leu Pro Phe Phe Ser Phe Ala Gly Val Tyr Ser Asn Glu Leu 
180 185 190 

Gln Val Pro Ser Cys Gly Leu Ser Phe Pro Trp Pro Glu Gln Val Trp 
195 200 205 

Phe Lys Ala Ser Arg Val Tyr Thr Leu Val Leu Gly Phe Val Leu Pro 
210 215 220 

Val Cys Thr Ile Cys Val Leu Tyr Thr Asp Leu Leu Arg Arg Leu Arg 
225 230 235 240 

Ala Val Arg Leu Arg Ser Gly Ala Lys Ala Leu Gly Lys Ala Arg Arg 
245 250 255 

Lys Val Lys Val Leu Val Leu Val Val Leu Ala Val Cys Leu Leu Cys 
260 265 270 

Trp Thr Pro Phe His Leu Ala Ser Val Val Ala Leu Thr Thr Asp Leu 
275 280 285 

Pro Gln Thr Pro Leu Val Ile Ser Met Ser Tyr Val Ile Thr Ser Leu 
290 295 300 

Thr Tyr Ala Asn Ser Cys Leu Asn Pro Phe Leu Tyr Ala Phe Leu Asp 
305 310 315 320 

Asp Asn Phe Arg Lys Asn Phe Arg Ser Ile Leu Arg Cys 
325 330 

(174) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

ATGGTCCTTG AGGTGAGTGA CCACCAAGTG CTAAATGACG CCGAGGTTGC CGCCCTCCTG 60 
GAGAACTTCA GCTCTTCCTA TGACTATGGA GAAAACGAGA GTGACTCGTG CTGTACCTCC 120 
CCGCCCTGCC CACAGGACTT CAGCCTGAAC TTCGACCGGG CCTTCCTGCC AGCCCTCTAC 180 
AGCCTCCTCT TTCTGCTGGG GCTGCTGGGC AACGGCGCGG TGGCAGCCGT GCTGCTGAGC 240 
CGGCGGACAG CCCTGAGCAG CACCGACACC TTCCTGCTCC ACCTAGCTGT AGCAGACACG 300 
CTGCTGGTGC TGACACTGCC GCTCTGGGCA GTGGACGCTG CCGTCCAGTG GGTCTTTGGC 360 
TCTGGCCTCT GCAAAGTGGC AGGTGCCCTC TTCAACATCA ACTTCTACGC AGGAGCCCTC 420 
CTGCTGGCCT GCATCAGCTT TGACCGCTAC CTGAACATAG TTCATGCCAC CCAGCTCTAC 480 
CGCCGGGGGC CCCCGGCCCG CGTGACCCTC ACCTGCCTGG CTGTCTGGGG GCTCTGCCTG 540 
CTTTTCGCCC TCCCAGACTT CATCTTCCTG TCGGCCCACC ACGACGAGCG CCTCAACGCC 600 
ACCCACTGCC AATACAACTT CCCACAGGTG GGCCGCACGG CTCTGCGGGT GCTGCAGCTG 660 
GTGGCTGGCT TTCTGCTGCC CCTGCTGGTC ATGGCCTACT GCTATGCCCA CATCCTGGCC 720 
GTGCTGCTGG TTTCCAGGGG CCAGCGGCGC CTGCGGGCCA AGCGGCTGGT GGTGGTGGTC 780 
GTGGTGGCCT TTGCCCTCTG CTGGACCCCC TATCACCTGG TGGTGCTGGT GGACATCCTC 840 
ATGGACCTGG GCGCTTTGGC CCGCAACTGT GGCCGAGAAA GCAGGGTAGA CGTGGCCAAG 900 
TCGGTCACCT CAGGCCTGGG CTACATGCAC TGCTGCCTCA ACCCGCTGCT CTATGCCTTT 960 
GTAGGGGTCA AGTTCCGGGA GCGGATGTGG ATGCTGCTCT TGCGCCTGGG CTGCCCCAAC 1020 
CAGAGAGGGC TCCAGAGGCA GCCATCGTCT TCCCGCCGGG ATTCATCCTG GTCTGAGACC 1080 
TCAGAGGCCT CCTACTCGGG CTTGTGA 1107 


(175) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

Met Val Leu Glu Val Ser Asp His Gln Val Leu Asn Asp Ala Glu Val 
1 5 10 15 

Ala Ala Leu Leu Glu Asn Phe Ser Ser Ser Tyr Asp Tyr Gly Glu Asn 
20 25 30 

Glu Ser Asp Ser Cys Cys Thr Ser Pro Pro Cys Pro Gln Asp Phe Ser 
35 40 45 

Leu Asn Phe Asp Arg Ala Phe Leu Pro Ala Leu Tyr Ser Leu Leu Phe 
50 55 60 

Leu Leu Gly Leu Leu Gly Asn Gly Ala Val Ala Ala Val Leu Leu Ser 
65 70 75 80 

Arg Arg Thr Ala Leu Ser Ser Thr Asp Thr Phe Leu Leu His Leu Ala 
85 90 95 

Val Ala Asp Thr Leu Leu Val Leu Thr Leu Pro Leu Trp Ala Val Asp 
100 105 110 

Ala Ala Val Gln Trp Val Phe Gly Ser Gly Leu Cys Lys Val Ala Gly 
115 120 125 

Ala Leu Phe Asn Ile Asn Phe Tyr Ala Gly Ala Leu Leu Leu Ala Cys 
130 135 140 

Ile Ser Phe Asp Arg Tyr Leu Asn Ile Val His Ala Thr Gln Leu Tyr 
145 150 155 160 

Arg Arg Gly Pro Pro Ala Arg Val Thr Leu Thr Cys Leu Ala Val Trp 
165 170 175 

Gly Leu Cys Leu Leu Phe Ala Leu Pro Asp Phe Ile Phe Leu Ser Ala 
180 185 190 

His His Asp Glu Arg Leu Asn Ala Thr His Cys Gln Tyr Asn Phe Pro 
195 200 205 

Gln Val Gly Arg Thr Ala Leu Arg Val Leu Gln Leu Val Ala Gly Phe 
210 215 220 

Leu Leu Pro Leu Leu Val Met Ala Tyr Cys Tyr Ala His Ile Leu Ala 
225 230 235 240 

Val Leu Leu Val Ser Arg Gly Gln Arg Arg Leu Arg Ala Lys Arg Leu 
245 250 255 

Val Val Val Val Val Val Ala Phe Ala Leu Cys Trp Thr Pro Tyr His 
260 265 270 

Leu Val Val Leu Val Asp Ile Leu Met Asp Leu Gly Ala Leu Ala Arg 
275 280 285 

Asn Cys Gly Arg Glu Ser Arg Val Asp Val Ala Lys Ser Val Thr Ser 
290 295 300 

Gly Leu Gly Tyr Met His Cys Cys Leu Asn Pro Leu Leu Tyr Ala Phe 
305 310 315 320 

Val Gly Val Lys Phe Arg Glu Arg Met Trp Met Leu Leu Leu Arg Leu 
325 330 335 

Gly Cys Pro Asn Gln Arg Gly Leu Gln Arg Gln Pro Ser Ser Ser Arg 
340 345 350 

Arg Asp Ser Ser Trp Ser Glu Thr Ser Glu Ala Ser Tyr Ser Gly Leu 
355 360 365 

(176) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

ATGGCTGATG ACTATGGCTC TGAATCCACA TCTTCCATGG AAGACTACGT TAACTTCAAC 60 

TTCACTGACT TCTACTGTGA GAAAAACAAT GTCAGGCAGT TTGCGAGCCA TTTCCTCCCA 120 

CCCTTGTACT GGCTCGTGTT CATCGTGGGT GCCTTGGGCA ACAGTCTTGT TATCCTTGTC 180 

TACTGGTACT GCACAAGAGT GAAGACCATG ACCGACATGT TCCTTTTGAA TTTGGCAATT 240 

GCTGACCTCC TCTTTCTTGT CACTCTTCCC TTCTGGGCCA TTGCTGCTGC TGACCAGTGG 300 

AAGTTCCAGA CCTTCATGTG CAAGGTGGTC AACAGCATGT ACAAGATGAA CTTCTACAGC 360 

TGTGTGTTGC TGATCATGTG CATCAGCGTG GACAGGTACA TTGCCATTGC CCAGGCCATG 420 

AGAGCACATA CTTGGAGGGA GAAAAGGCTT TTGTACAGCA AAATGGTTTG CTTTACCATC 480 

TGGGTATTGG CAGCTGCTCT CTGCATCCCA GAAATCTTAT ACAGCCAAAT CAAGGAGGAA 540 

TCCGGCATTG CTATCTGCAC CATGGTTTAC CCTAGCGATG AGAGCACCAA ACTGAAGTCA 600 

GCTGTCTTGA CCCTGAAGGT CATTCTGGGG TTCTTCCTTC CCTTCGTGGT CATGGCTTGC 660 

TGCTATACCA TCATCATTCA CACCCTGATA CAAGCCAAGA AGTCTTCCAA GCACAAAGCC 720 

AAGAAAGTGA CCATCACTGT CCTGACCGTC TTTGTCTTGT CTCAGTTTCC CTACAACTGC 780 

ATTTTGTTGG TGCAGACCAT TGACGCCTAT GCCATGTTCA TCTCCAACTG TGCCGTTTCC 840 
ACCAACATTG ACATCTGCTT CCAGGTCACC CAGACCATCG CCTTCTTCCA CAGTTGCCTG 900 
AACCCTGTTC TCTATC-mT TGTGGGTGAG AGATTCCGCC GGGATCTCGT GAAAACCCTG 960 
AAGAACTTGG GTTGCATCAG CCAGGCCCAG TGGGTTTCAT TTACAAGGAG AGAGGGAAGC 1020 
TTGAAGCTGT CGTCTATGTT GCTGGAGACA ACCTCAGGAG CACTCTCCCT CTGA 1074 
(177) INFORMATION FOR SEQ ID NO:176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Met Ala Asp Asp Tyr Gly Ser Glu Ser Thr Ser Ser Met Glu Asp Tyr
1 5 10 15

Val Asn Phe Asn Phe Thr Asp Phe Tyr Cys Glu Lys Asn Asn Val Arg
20 25 30

Gln Phe Ala Ser His Phe Leu Pro Pro Leu Tyr Trp Leu Val Phe Ile
35 40 45

Val Gly Ala Leu Gly Asn Ser Leu Val Ile Leu Val Tyr Trp Tyr Cys
50 55 60

Thr Arg Val Lys Thr Met Thr Asp Met Phe Leu Leu Asn Leu Ala Ile
65 70 75 80

Ala Asp Leu Leu Phe Leu Val Thr Leu Pro Phe Trp Ala Ile Ala Ala
85 90 95

Ala Asp Gln Trp Lys Phe Gln Thr Phe Met Cys Lys Val Val Asn Ser
100 105 110

Met Tyr Lys Met Asn Phe Tyr Ser Cys Val Leu Leu Ile Met Cys Ile
115 120 125

Ser Val Asp Arg Tyr Ile Ala Ile Ala Gln Ala Met Arg Ala His Thr
130 135 140

Trp Arg Glu Lys Arg Leu Leu Tyr Ser Lys Met Val Cys Phe Thr Ile
145 150 155 160

Trp Val Leu Ala Ala Ala Leu Cys Ile Pro Glu Ile Leu Tyr Ser Gln
165 170 175

Ile Lys Glu Glu Ser Gly Ile Ala Ile Cys Thr Met Val Tyr Pro Ser
180 185 190

Asp Glu Ser Thr Lys Leu Lys Ser Ala Val Leu Thr Leu Lys Val Ile
195 200 205

Leu Gly Phe Phe Leu Pro Phe Val Val Met Ala Cys Cys Tyr Thr Ile
210 215 220

Ile Ile His Thr Leu Ile Gln Ala Lys Lys Ser Ser Lys His Lys Ala
225 230 235 240

Lys Lys Val Thr Ile Thr Val Leu Thr Val Phe Val Leu Ser Gln Phe
245 250 255

Pro Tyr Asn Cys Ile Leu Leu Val Gln Thr Ile Asp Ala Tyr Ala Met
260 265 270

Phe Ile Ser Asn Cys Ala Val Ser Thr Asn Ile Asp Ile Cys Phe Gln
275 280 285

Val Thr Gln Thr Ile Ala Phe Phe His Ser Cys Leu Asn Pro Val Leu
290 295 300

Tyr Val Phe Val Gly Glu Arg Phe Arg Arg Asp Leu Val Lys Thr Leu
305 310 315 320

Lys Asn Leu Gly Cys Ile Ser Gln Ala Gln Trp Val Ser Phe Thr Arg
325 330 335

Arg Glu Gly Ser Leu Lys Leu Ser Ser Met Leu Leu Glu Thr Thr Ser
340 345 350

Gly Ala Leu Ser Leu
355

(178) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1110 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

ATGGCCTCAT CGACCACTCG GGGCCCCAGG GTTTCTGACT TATTTTCTGG GCTGCCGCCG 60 

GCGGTCACAA CTCCCGCCAA CCAGAGCGCA GAGGCCTCGG CGGGCAACGG GTCGGTGGCT 120 

GGCGCGGACG CTCCAGCCGT CACGCCCTTC CAGAGCCTGC AGCTGGTGCA TCAGCTGAAG 180 

GGGCTGATCG TGCTGCTCTA CAGCGTCGTG GTGGTCGTGG GGCTGGTGGG CAACTGCCTG 240 

CTGGTGCTGG TGATCGCGCG GGTGCCGCGG CTGCACAACG TGACGAACTT CCTCATCGGC 300

AACCTGGCCT TGTCCGACGT GCTCATGTGC ACCGCCTGCG TGCCGCTCAC GCTGGCCTAT 360

GCCTTCGAGC CACGCGGCTG GGTGTTCGGC GGCGGCCTGT GCCACCTGGT CTTCTTCCTG 420

CAGCCGGTCA CCGTCTATGT GTCGGTGTTC ACGCTCACCA CCATCGCAGT GGACCGCTAC 480

GTCGTGCTGG TGCACCCGCT GAGGCGCGCA TCTCGCTGCG CCTCAGCCTA CGCTGTGCTC 540 

GCCATCTGGG CGCTGTCCGC GGTGCTGGCG CTGCCGCCCG CCGTGCACAC CTATCACGTC 600 

GAGCTCAAGC CGCACGACGT GCGCCTCTGC GAGGAGTTCT GGGGCTCCCA GGAGCGCCAG 660

CGCCAGCTCT ACGCCTGGGG GCTGCTGCTG GTCACCTACC TGCTCCCTCT GCTGGTCATC 720 

CTCCTGTCTT ACGTCCGGGT GTCAGTGAAG CTCCGCAACC GCGTGGTGCC GGGCTGCGTG 780 

ACCCAGAGCC AGGCCGACTG GGACCGCGCT CGGCGCCGGC GCACCAAATG CTTGCTGGTG 840

GTGGTCGTGG TGGTGTTCGC CGTCTGCTGG CTGCCGCTCC ACGTCTTCAA CCTCCTGCGG 900

GACCTCGACC CCCACGCCAT CGACCCTTAC GCCTTTGGGC TGGTGCAGCT GCTCTGCCAC 960 

TGGCTCGCCA TGAGTTCGGC CTGCTACAAC CCCTTCATCT ACGCCTGGCT GCACGACAGC 1020

TTCCGCGAGG AGCTGCGCAA ACTGTTGGTC GCTTGGCCCC GCAAGATAGC CCCCCATGGC 1080

CAGAATATGA CCGTCAGCGT GGTCATCTGA 1110

(179) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Met Ala Ser Ser Thr Thr Arg Gly Pro Arg Val Ser Asp Leu Phe Ser
1 5 10 15

Gly Leu Pro Pro Ala Val Thr Thr Pro Ala Asn Gln Ser Ala Glu Ala
20 25 30

Ser Ala Gly Asn Gly Ser Val Ala Gly Ala Asp Ala Pro Ala Val Thr
35 40 45

Pro Phe Gln Ser Leu Gln Leu Val His Gln Leu Lys Gly Leu Ile Val
50 55 60

Leu Leu Tyr Ser Val Val Val Val Val Gly Leu Val Gly Asn Cys Leu
65 70 75 80

Leu Val Leu Val Ile Ala Arg Val Pro Arg Leu His Asn Val Thr Asn
85 90 95

Phe Leu Ile Gly Asn Leu Ala Leu Ser Asp Val Leu Met Cys Thr Ala
100 105 110

Cys Val Pro Leu Thr Leu Ala Tyr Ala Phe Glu Pro Arg Gly Trp Val
115 120 125

Phe Gly Gly Gly Leu Cys His Leu Val Phe Phe Leu Gln Pro Val Thr
130 135 140

Val Tyr Val Ser Val Phe Thr Leu Thr Thr Ile Ala Val Asp Arg Tyr
145 150 155 160

Val Val Leu Val His Pro Leu Arg Arg Ala Ser Arg Cys Ala Ser Ala
165 170 175

Tyr Ala Val Leu Ala Ile Trp Ala Leu Ser Ala Val Leu Ala Leu Pro
180 185 190

Pro Ala Val His Thr Tyr His Val Glu Leu Lys Pro His Asp Val Arg
195 200 205

Leu Cys Glu Glu Phe Trp Gly Ser Gln Glu Arg Gln Arg Gln Leu Tyr
210 215 220

Ala Trp Gly Leu Leu Leu Val Thr Tyr Leu Leu Pro Leu Leu Val Ile
225 230 235 240

Leu Leu Ser Tyr Val Arg Val Ser Val Lys Leu Arg Asn Arg Val Val
245 250 255

Pro Gly Cys Val Thr Gln Ser Gln Ala Asp Trp Asp Arg Ala Arg Arg
260 265 270

Arg Arg Thr Lys Cys Leu Leu Val Val Val Val Val Val Phe Ala Val
275 280 285

Cys Trp Leu Pro Leu His Val Phe Asn Leu Leu Arg Asp Leu Asp Pro
290 295 300

His Ala Ile Asp Pro Tyr Ala Phe Gly Leu Val Gln Leu Leu Cys His
305 310 315 320

Trp Leu Ala Met Ser Ser Ala Cys Tyr Asn Pro Phe Ile Tyr Ala Trp
325 330 335

Leu His Asp Ser Phe Arg Glu Glu Leu Arg Lys Leu Leu Val Ala Trp
340 345 350

Pro Arg Lys Ile Ala Pro His Gly Gln Asn Met Thr Val Ser Val Val
355 360 365

Ile

(180) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1083 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

ATGGACCCAG AAGAAACTTC AGTTTAITTO GATTATTACT ATGCTACGAG CCCAAACTCT 60

GACATCAGGG AGACCCACTC CCATGTTCCT TACACCTCTG TCTTCCTTCC AGTCTTTTAC 120

ACAGCTGTGT TCCTGACTGG AGTGCTGGGG AACCTTGTTC TCATGGGAGC GTTGCATTTC 180

AAACCCGGCA GCCGAAGACT GATCGACATC TTTATCATCA ATCTGGCTGC CTCTCACTTC 240

ATTTTTCTTG TCACAITCCC TCTCTGGGTG GATAAAGAAG CATCTCTAGG ACTGTGGAGG 300

ACGGGCTCCT TCCTGTGCAA AGGGAGCTCC TACATGATCT CCGTCAATAT GCACTGCAGT 360

GTCCTCCTGC TCACITGCAT GAGTGTTGAC CGCTACCTGG CCATTGTGTG GCCAGTCGTA 420

TCCAGGAAAT TCAGAAGGAC AGACTGTGCA TATGTAGTCT GTGCCAGCAT CTGGTTTATC 480

TCCTGCCTGC 1.3GGGTTGCC TACTCTTCTG TCCAGGGAGC TCACGCTGAT TGATGATAAG 540

CCATACTGTG CAGAGAAAAA GGCAACTCCA ATTAAACTCA TATGGTCCCT GGTGGCCTTA 600

^™ACCT rrrTTGTCCC TTTGrrGAGC ATTGTGACCT GCTACTGTTG CATTGCAAGG 660

AAGCTGTGTG CCCATTACCA GCAATCAGGA AAGCACAACA AAAAGCTGAA GAAATCTAAG 720

AAGATCATCT TTAirGTCGT GGCAGCCTTT CTTGTCTCCT GGCTGCCCTT CAATACITTC 780

AAGTTCCTGG CCAOrGTCTC TGGGTTGCGG CAAGAACACT ATTTACCCTC AGCTATTCTT 840

CAGCTTGGTA TGGAGGTGAG TGGACCCITG GCATTTCCCA ACAGCTGTGT CAACCCTTTC 900

ATTTACTATA TCTTCGACAG CTACATCCGC CGGGCCATTG TCCACTGCrP GTGCCCTTGC 960

CTGAAAAACT ATGACTTTGG GAGTAGCACT GAGACATCAG ATAGTCACCT CACTAAGGCT 1020

CTCTCCACCT TCATTCATGC AGAAGATTTT GCCAGGAGGA GGAAGAGGTC TGTOTCACTC 1080

TAA 1083

(181) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:180: 

Met Asp Pro Glu Glu Thr Ser Val Tyr Leu Asp Tyr Tyr Tyr Ala Thr
1 5 10 15

Ser Pro Asn Ser Asp Ile Arg Glu Thr His Ser His Val Pro Tyr Thr
20 25 30

Ser Val Phe Leu Pro Val Phe Tyr Thr Ala Val Phe Leu Thr Gly Val
35 40 45

Leu Gly Asn Leu Val Leu Met Gly Ala Leu His Phe Lys Pro Gly Ser
50 55 60

Arg Arg Leu Ile Asp Ile Phe Ile Ile Asn Leu Ala Ala Ser Asp Phe
65 70 75 80

Ile Phe Leu Val Thr Leu Pro Leu Trp Val Asp Lys Glu Ala Ser Leu
85 90 95

Gly Leu Trp Arg Thr Gly Ser Phe Leu Cys Lys Gly Ser Ser Tyr Met
100 105 110

Ile Ser Val Asn Met His Cys Ser Val Leu Leu Leu Thr Cys Met Ser
115 120 125

Val Asp Arg Tyr Leu Ala Ile Val Trp Pro Val Val Ser Arg Lys Phe
130 135 140

Arg Arg Thr Asp Cys Ala Tyr Val Val Cys Ala Ser Ile Trp Phe Ile
145 150 155 160

Ser Cys Leu Leu Gly Leu Pro Thr Leu Leu Ser Arg Glu Leu Thr Leu
165 170 175

Ile Asp Asp Lys Pro Tyr Cys Ala Glu Lys Lys Ala Thr Pro Ile Lys
180 185 190

Leu Ile Trp Ser Leu Val Ala Leu Ile Phe Thr Phe Phe Val Pro Leu
195 200 205

Leu Ser Ile Val Thr Cys Tyr Cys Cys Ile Ala Arg Lys Leu Cys Ala
210 215 220

His Tyr Gln Gln Ser Gly Lys His Asn Lys Lys Leu Lys Lys Ser Lys
225 230 235 240

Lys Ile Ile Phe Ile Val Val Ala Ala Phe Leu Val Ser Trp Leu Pro
245 250 255

Phe Asn Thr Phe Lys Phe Leu Ala Ile Val Ser Gly Leu Arg Gln Glu
260 265 270

His Tyr Leu Pro Ser Ala Ile Leu Gln Leu Gly Met Glu Val Ser Gly
275 280 285

Pro Leu Ala Phe Ala Asn Ser Cys Val Asn Pro Phe Ile Tyr Tyr Ile
290 295 300

Phe Asp Ser Tyr Ile Arg Arg Ala Ile Val His Cys Leu Cys Pro Cys
305 310 315 320

Leu Lys Asn Tyr Asp Phe Gly Ser Ser Thr Glu Thr Ser Asp Ser His
325 330 335

Leu Thr Lys Ala Leu Ser Thr Phe Ile His Ala Glu Asp Phe Ala Arg
340 345 350

Arg Arg Lys Arg Ser Val Ser Leu
355 360

(182) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1020 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:



ATGAATGGCC TTGAAGTGGC TCCCCCAGGT CTGATCACCA ACTTCTCCCT GGCCACGGCA 60

GAGCAATGTG GCCAGGAGAC GCCACTGGAG AACATGCTGT TCGCCTCCTT CTACCTTCTG 120

GATTTTATCC TGGCTTTAGT TGGCAATACC CTGGCTCTGT GGCTTTTCAT CCGAGACCAC 180

AAGTCCGGGA CCCCGGCCAA CGTGTTCCTG ATGCATCTGG CCGTGGCCGA CTTGTCGTGC 240

GTGCTGGTCC TGCCCACCCG CCTGGTCTAC CACTTCTCTG GGAACCACTG GCCATTTGGG 300

GAAATCGCAT GCCGTCTCAC CGGCTTCCTC TTCTACCTCA ACATGTACGC CAGCATCTAC 360

TTCCTCACCT GCATCAGCGC CGACCGTTTC CTGGCCATTG TGCACCCGGT CAAGTCCCTC 420

AAGCTCCGCA GGCCCCTCTA CGCACACCTG GCCTGTGCCT TCCTGTGGGT GGTGGTGGCT 480

GTGGCCATGG CCCCGCTGCT GGTGAGCCCA CAGACCGTGC AGACCAACCA CACGGTGGTC 540

TGCCTGCAGC TGTACCGGGA GAAGGCCTCC CACCATGCCC TGGTGTCCCT GGCAGTGGCC 600

TTCACCTTCC CGTTCATCAC CACGGTCACC TGCTACCTGC TGATCATCCG CAGCCTGCGG 660

CAGGGCCTGC GTGTGGAGAA GCGCCTCAAG ACCAAGGCAA AACGCATGAT CGCCATAGTG 720

CTGGCCATCT TCCTGGTCTG CTTCGTGCCC TACCACGTCA ACCGCTCCGT CTACGTGCTG 780

CACTACCGCA GCCATGGGGC CTCCTGCGCC ACCCAGCGCA TCCTGGCCCT GGCAAACCGC 840

ATCACCTCCT GCCTCACCAG CCTCAACGGG GCACTCGACC CCATCATGTA TTTCTTCGTG 900

GCTGAGAAGT TCCGCCACGC CCTGTGCAAC TTGCTCTGTG GCAAAAGGCT CAAGGGCCCG 960

CCCCCCAGCT TCGAAGGGAA AACCAACGAG AGCTCGCTGA GTGCCAAGTC AGAGCTGTGA 1020

(183) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Met Asn Gly Leu Glu Val Ala Pro Pro Gly Leu Ile Thr Asn Phe Ser
1 5 10 15

Leu Ala Thr Ala Glu Gln Cys Gly Gln Glu Thr Pro Leu Glu Asn Met
20 25 30

Leu Phe Ala Ser Phe Tyr Leu Leu Asp Phe Ile Leu Ala Leu Val Gly
35 40 45

Asn Thr Leu Ala Leu Trp Leu Phe Ile Arg Asp His Lys Ser Gly Thr
50 55 60

Pro Ala Asn Val Phe Leu Met His Leu Ala Val Ala Asp Leu Ser Cys
65 70 75 80

Val Leu Val Leu Pro Thr Arg Leu Val Tyr His Phe Ser Gly Asn His
85 90 95

Trp Pro Phe Gly Glu Ile Ala Cys Arg Leu Thr Gly Phe Leu Phe Tyr
100 105 110

Leu Asn Met Tyr Ala Ser Ile Tyr Phe Leu Thr Cys Ile Ser Ala Asp
115 120 125

Arg Phe Leu Ala Ile Val His Pro Val Lys Ser Leu Lys Leu Arg Arg
130 135 140

Pro Leu Tyr Ala His Leu Ala Cys Ala Phe Leu Trp Val Val Val Ala
145 150 155 160

Val Ala Met Ala Pro Leu Leu Val Ser Pro Gln Thr Val Gln Thr Asn
165 170 175

His Thr Val Val Cys Leu Gln Leu Tyr Arg Glu Lys Ala Ser His His
180 185 190

Ala Leu Val Ser Leu Ala Val Ala Phe Thr Phe Pro Phe Ile Thr Thr
195 200 205

Val Thr Cys Tyr Leu Leu Ile Ile Arg Ser Leu Arg Gln Gly Leu Arg
210 215 220

Val Glu Lys Arg Leu Lys Thr Lys Ala Lys Arg Met Ile Ala Ile Val
225 230 235 240

Leu Ala Ile Phe Leu Val Cys Phe Val Pro Tyr His Val Asn Arg Ser
245 250 255

Val Tyr Val Leu His Tyr Arg Ser His Gly Ala Ser Cys Ala Thr Gln
260 265 270

Arg Ile Leu Ala Leu Ala Asn Arg Ile Thr Ser Cys Leu Thr Ser Leu
275 280 285

Asn Gly Ala Leu Asp Pro Ile Met Tyr Phe Phe Val Ala Glu Lys Phe
290 295 300

Arg His Ala Leu Cys Asn Leu Leu Cys Gly Lys Arg Leu Lys Gly Pro
305 310 315 320

Pro Pro Ser Phe Glu Gly Lys Thr Asn Glu Ser Ser Leu Ser Ala Lys
325 330 335

Ser Glu Leu

(184) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

ATGATCACCC TGAACAATCA AGATCAACCT GTCCCTTTTA ACAGCTCACA TCCAGATGAA 60 

TACAAAATTG CAGCCCTTGT CTTCTATAGC TGTATCTTCA TAATTGGATT ATTTGTTAAC 120

ATCACTGCAT TATGGGO^T CAGT^TACC ACCAAGAAGA GAACCACGGT AACCATCTAT 180 

^^^TG TGGCAOTAGT GGAC^^ATA T^ATAATGA CTTTACCCTT TC^^^ 240

TATTATGCAA AAGATGAATG GCCATTTGGA GAGTACTTCT GCCAGATTCT TGGAGCTCTC 300

ACAGTGTTTT ACCCAAGCAT TGCTTTATGG CTTCTTGCCT TTATTAGTGC TGACAGATAC 360

ATGGCCATTG TACAGCCGAA GTACGCCAAA GAACTTAAAA ACACGTGCAA AGCCGTGCTG 420

GCGTGTGTGG GAGTCTGGAT AATGACCCTG ACCACGACCA CCCCTCTGCT ACTGCTCTAT 480 

AAAGACCCAG ATAAAGACTC CACTCCCGCC ACCTGCCTCA AGATTTCTGA CATCATCTAT 540 

CTAAAAGCTG TGAACGTGCT GAACCTCACT CGACTGACAT TTTTTTTCTT GATTCCTTTG 600

TTCATCATGA TTGGGTGCTA CTTGGTCATT ATTCATAATC TCCTTCACGG CAGGACGTCT 660 

AAGCTGAAAC CCAAAGTCAA GGAGAAGTCC AAAAGGATCA TCATCACGCT GCTGGTGCAG 720 

GTGCTCGTCT GCTTTATGCC CTTCCACATC TGTTTCGCTT TCCTGATGCT GGGAACGGGG 780

GAGAATAGTT ACAATCCCTG GGGAGCCTTT ACCACCTTCC TCATGAACCT CAGCACGTGT 840 

CTGGATGTGA TTCTCTACTA CATCGTTTCA AAACAATTTC AGGCTCGAGT CATTAGTGTC 900 

ATGCTATACC GTAATTACCT TCGAAGCATG CGCAGAAAAA GTTTCCGATC TGGTAGTCTA 960 

AGGTCACTAA GCAATATAAA CAGTGAAATG TTATGA 996 
(185) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Ile Thr Leu Asn Asn Gln Asp Gln Pro Val Pro Phe Asn Ser Ser
1 5 10 15

His Pro Asp Glu Tyr Lys Ile Ala Ala Leu Val Phe Tyr Ser Cys Ile
20 25 30

Phe Ile Ile Gly Leu Phe Val Asn Ile Thr Ala Leu Trp Val Phe Ser
35 40 45

Cys Thr Thr Lys Lys Arg Thr Thr Val Thr Ile Tyr Met Met Asn Val
50 55 60

Ala Leu Val Asp Leu Ile Phe Ile Met Thr Leu Pro Phe Arg Met Phe
65 70 75 80

Tyr Tyr Ala Lys Asp Glu Trp Pro Phe Gly Glu Tyr Phe Cys Gln Ile
85 90 95

Leu Gly Ala Leu Thr Val Phe Tyr Pro Ser Ile Ala Leu Trp Leu Leu
100 105 110

Ala Phe Ile Ser Ala Asp Arg Tyr Met Ala Ile Val Gln Pro Lys Tyr
115 120 125

Ala Lys Glu Leu Lys Asn Thr Cys Lys Ala Val Leu Ala Cys Val Gly
130 135 140

Val Trp Ile Met Thr Leu Thr Thr Thr Thr Pro Leu Leu Leu Leu Tyr
145 150 155 160

Lys Asp Pro Asp Lys Asp Ser Thr Pro Ala Thr Cys Leu Lys Ile Ser
165 170 175

Asp Ile Ile Tyr Leu Lys Ala Val Asn Val Leu Asn Leu Thr Arg Leu
180 185 190

Thr Phe Phe Phe Leu Ile Pro Leu Phe Ile Met Ile Gly Cys Tyr Leu
195 200 205

Val Ile Ile His Asn Leu Leu His Gly Arg Thr Ser Lys Leu Lys Pro
210 215 220

Lys Val Lys Glu Lys Ser Lys Arg Ile Ile Ile Thr Leu Leu Val Gln
225 230 235 240

Val Leu Val Cys Phe Met Pro Phe His Ile Cys Phe Ala Phe Leu Met
245 250 255

Leu Gly Thr Gly Glu Asn Ser Tyr Asn Pro Trp Gly Ala Phe Thr Thr
260 265 270

Phe Leu Met Asn Leu Ser Thr Cys Leu Asp Val Ile Leu Tyr Tyr Ile
275 280 285

Val Ser Lys Gln Phe Gln Ala Arg Val Ile Ser Val Met Leu Tyr Arg
290 295 300

Asn Tyr Leu Arg Ser Met Arg Arg Lys Ser Phe Arg Ser Gly Ser Leu
305 310 315 320

Arg Ser Leu Ser Asn Ile Asn Ser Glu Met Leu
325 330

(186) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1077 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

ATGCCCTCTG TGTCTCCAGC GGGGCCCTCG GCCGGGGCAG TCCCCAATGC CACCGCAGTG 60 

ACAACAGTGC GGACCAATGC CAGCGGGCTG GAGGTGCCCC TGTTCCACCT GTTTGCCCGG 120 

CTGGACGAGG AGCTGCATGG CACCTTCCCA GGCCTGTGCG TGGCGCTGAT GGCGGTGCAC 180 

GGAGCCATCT TCCTGGCAGG GCTGGTGCTC AACGGGCTGG CGCTGTACGT CTTCTGCTGC 240 

CGCACCCGGG CCAAGACACC CTCAGTCATC TACACCATCA ACCTGGTGGT GACCGATCTA 300 

CTGGTAGGGC TGTCCCTGCC CACGCGCTTC GCTGTGTACT ACGGCGCCAG GGGCTGCCTG 360 

CGCTGTGCCT TCCCGCACGT CCTCGGTTAC TTCCTCAACA TGCACTGCTC CATCCTCTTC 420 

CTCACCTGCA TCTGCGTGGA CCGCTACCTG GCCATCGTGC GGCCCGAAGG CTCCCGCCGC 480 

TGCCGCCAGC CTGCCTGTGC CAGGGCCGTG TGCGCCTTCG TGTGGCTGGC CGCCGGTGCC 540

GTCACCCTGT CGGTGCTGGG CGTGACAGGC AGCCGGCCCT GCTGCCGTGT CTTTGCGCTG 600 

ACTGTCCTGG AGTTCCTGCT GCCCCTGCTG GTCATCAGCG TGTTTACCGG CCGCATCATG 660 

TGTGCACTGT CGCGGCCGGG TCTGCTCCAC CAGGGTCGCC AGCGCCGCGT GCGGGCCAAG 720 

CAGCTCCTGC TCACGGTGCT CATCATCTTT CTCGTCTGCT TCACGCCCTT CCACGCCCGC 780

CAAGTGGCCG TGGCGCTGTG GCCCGACATG CCACACCACA CGAGCCTCGT GGTCTACCAC 840

GTGGCCGTGA CCCTCAGCAG CCTCAACAGC TGCATGGACC CCATCGTCTA CTGCTTCGTC 900 

ACCAGTGGCT TCCAGGCCAC CGTCCGAGGC CTCTTCGGCC AGCACGGAGA GCGTGAGCCC 960 

AGCAGCGGTG ACGTGGTCAG CATGCACAGG AGCTCCAAGG GCTCAGGCCG TCATCACATC 1020 

CTCAGTGCCG GCCCTCACGC CCTCACCCAG GCCCTGGCTA ATGGGCCCGA GGCTTAG 1077 
(187) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 358 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186: 

Met Pro Ser Val Ser Pro Ala Gly Pro Ser Ala Gly Ala Val Pro Asn
1 5 10 15

Ala Thr Ala Val Thr Thr Val Arg Thr Asn Ala Ser Gly Leu Glu Val
20 25 30

Pro Leu Phe His Leu Phe Ala Arg Leu Asp Glu Glu Leu His Gly Thr
35 40 45

Phe Pro Gly Leu Cys Val Ala Leu Met Ala Val His Gly Ala Ile Phe
50 55 60

Leu Ala Gly Leu Val Leu Asn Gly Leu Ala Leu Tyr Val Phe Cys Cys
65 70 75 80

Arg Thr Arg Ala Lys Thr Pro Ser Val Ile Tyr Thr Ile Asn Leu Val
85 90 95

Val Thr Asp Leu Leu Val Gly Leu Ser Leu Pro Thr Arg Phe Ala Val
100 105 110

Tyr Tyr Gly Ala Arg Gly Cys Leu Arg Cys Ala Phe Pro His Val Leu
115 120 125

Gly Tyr Phe Leu Asn Met His Cys Ser Ile Leu Phe Leu Thr Cys Ile
130 135 140

Cys Val Asp Arg Tyr Leu Ala Ile Val Arg Pro Glu Gly Ser Arg Ala
145 150 155 160

Cys Arg Gln Pro Ala Cys Ala Arg Ala Val Cys Ala Phe Val Trp Leu
165 170 175

Ala Ala Gly Ala Val Thr Leu Ser Val Leu Gly Val Thr Gly Ser Arg
180 185 190

Pro Cys Cys Arg Val Phe Ala Leu Thr Val Leu Glu Phe Leu Leu Pro
195 200 205

Leu Leu Val Ile Ser Val Phe Thr Gly Arg Ile Met Cys Ala Leu Ser
210 215 220

Arg Pro Gly Leu Leu His Gln Gly Arg Gln Arg Arg Val Arg Ala Lys
225 230 235 240

Gln Leu Leu Leu Thr Val Leu Ile Ile Phe Leu Val Cys Phe Thr Pro
245 250 255

Phe His Ala Arg Gln Val Ala Val Ala Leu Trp Pro Asp Met Pro His
260 265 270

His Thr Ser Leu Val Val Tyr His Val Ala Val Thr Leu Ser Ser Leu
275 280 285

Asn Ser Cys Met Asp Pro Ile Val Tyr Cys Phe Val Thr Ser Gly Phe
290 295 300

Gln Ala Thr Val Arg Gly Leu Phe Gly Gln His Gly Glu Arg Glu Pro
305 310 315 320

Ser Ser Gly Asp Val Val Ser Met His Arg Ser Ser Lys Gly Ser Gly
325 330 335

Arg His His Ile Leu Ser Ala Gly Pro His Ala Leu Thr Gln Ala Leu
340 345 350

Ala Asn Gly Pro Glu Ala
355

(188) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1050 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

ATGAACTCCA CCTTGGATGG TAATCAGAGC AGCCACCCTT TTTGCCTCTT GGCATTTGGC 60 

TATTTGGAAA CTGTCAATTT TTGCCTTTTG GAAGTATTGA TTATTGTCTT TCTAACTGTA 120

TTGATTATTT CTGGCAACAT CATTGTGATT TTTGTATTTC ACTGTGCACC TTTGTTGAAC 180 

CATCACACTA CAAGTTATTT TATCCAGACT ATGGCATATG CTGACCTTTT TGTTGGGGTG 240 

AGCTGCGTGG TCCCTTCTTT ATCACTCCTC CATCACCCCC TTCCAGTAGA GGAGTCCTTG 300 

ACTTGCCAGA TATTTGGTTT TGTAGTATCA GTTCTGAAGA GCGTCTCCAT GGCTTCTCTG 360 

GCCTGTATCA GCATTGATAG ATACATTGCC ATTACTAAAC CTTTAACCTA TAATACTCTG 420

GTTACACCCT GGAGACTACG CCTGTGTATT TTCCTGATTT GGCTATACTC GACCCTGGTC 480 

TTCCTGCCTT CCTTTTTCCA CTGGGGCAAA CCTGGATATC ATGGAGATGT GTTTCAGTGG 540 

TGTGCGGAGT CCTGGCACAC CGACTCCTAC TTCACCCTGT TCATCGTGAT GATGTTATAT 600 

GCCCCAGCAG CCCTTATTGT CTGCTTCACC TATTTCAACA TCTTCCGCAT CTGCCAACAG 660 

CACACAAAGG ATATCAGCGA AAGGCAAGCC CGCTTCAGCA GCCAGAGTGG GGAGACTGGG 720

GAAGTGCAGG CCTGTCCTGA TAAGCGCTAT AAAATGGTCC TGTTTCGAAT CACTAGTGTA 780 

TTTTACATCC TCTGGTTGCC ATATATCATC TACTTCTTGT TGGAAAGCTC CACTGGCCAC 840 

AGCAACCGCT TCGCATCCTT CTTGACCACC TGGCTTGCTA TTAGTAACAG TTTCTGCAAC 900 

TGTGTAATTT ATAGTCTCTC CAACAGTGTA TTCCAAAGAG GACTAAAGCG CCTCTCAGGG 960 

GCTATGTGTA CTTCTTGTGC AAGTCAGACT ACAGCCAACG ACCCTTACAC AGTTAGAAGC 1020

AAAGGCCCTC TTAATGGATG TCATATCTGA 1050 






(189) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 349 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

Met Asn Ser Thr Leu Asp Gly Asn Gln Ser Ser His Pro Phe Cys Leu
1 5 10 15

Leu Ala Phe Gly Tyr Leu Glu Thr Val Asn Phe Cys Leu Leu Glu Val
20 25 30

Leu Ile Ile Val Phe Leu Thr Val Leu Ile Ile Ser Gly Asn Ile Ile
35 40 45

Val Ile Phe Val Phe His Cys Ala Pro Leu Leu Asn His His Thr Thr
50 55 60

Ser Tyr Phe Ile Gln Thr Met Ala Tyr Ala Asp Leu Phe Val Gly Val
65 70 75 80

Ser Cys Val Val Pro Ser Leu Ser Leu Leu His His Pro Leu Pro Val
85 90 95

Glu Glu Ser Leu Thr Cys Gln Ile Phe Gly Phe Val Val Ser Val Leu
100 105 110

Lys Ser Val Ser Met Ala Ser Leu Ala Cys Ile Ser Ile Asp Arg Tyr
115 120 125

Ile Ala Ile Thr Lys Pro Leu Thr Tyr Asn Thr Leu Val Thr Pro Trp
130 135 140

Arg Leu Arg Leu Cys Ile Phe Leu Ile Trp Leu Tyr Ser Thr Leu Val
145 150 155 160

Phe Leu Pro Ser Phe Phe His Trp Gly Lys Pro Gly Tyr His Gly Asp
165 170 175

Val Phe Gln Trp Cys Ala Glu Ser Trp His Thr Asp Ser Tyr Phe Thr
180 185 190

Leu Phe Ile Val Met Met Leu Tyr Ala Pro Ala Ala Leu Ile Val Cys
195 200 205

Phe Thr Tyr Phe Asn Ile Phe Arg Ile Cys Gln Gln His Thr Lys Asp
210 215 220

Ile Ser Glu Arg Gln Ala Arg Phe Ser Ser Gln Ser Gly Glu Thr Gly
225 230 235 240

Glu Val Gln Ala Cys Pro Asp Lys Arg Tyr Lys Met Val Leu Phe Arg
245 250 255

Ile Thr Ser Val Phe Tyr Ile Leu Trp Leu Pro Tyr Ile Ile Tyr Phe
260 265 270

Leu Leu Glu Ser Ser Thr Gly His Ser Asn Arg Phe Ala Ser Phe Leu
275 280 285

Thr Thr Trp Leu Ala Ile Ser Asn Ser Phe Cys Asn Cys Val Ile Tyr
290 295 300

Ser Leu Ser Asn Ser Val Phe Gln Arg Gly Leu Lys Arg Leu Ser Gly
305 310 315 320

Ala Met Cys Thr Ser Cys Ala Ser Gln Thr Thr Ala Asn Asp Pro Tyr
325 330 335

Thr Val Arg Ser Lys Gly Pro Leu Asn Gly Cys His Ile
340 345

(190) INFORMATION FOR SEQ ID NO:189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1302 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

ATGTGTTTTT CTCCCATTCT GGAAATCAAC ATGCAGTCTG AATCTAACAT TACAGTGCGA 60 

GATGACATTG ATGACATCAA CACCAATATG TACCAACCAC TATCATATCC GTTAAGCTTT 120 

CAAGTGTCTC TCACCGGATT TCTTATGTTA GAAATTGTGT TGGGACTTGG CAGCAACCTC 180 

ACTGTATTGG TACTTTACTG CATGAAATCC AACTTAATCA ACTCTGTCAG TAACATTATT 240 

ACAATGAATC TTCATGTACT TGATGTAATA ATTTGTGTGG GATGTATTCC TCTAACTATA 300 

GTTATCCTTC TGCTTTCACT GGAGAGTAAC ACTGCTCTCA TTTGCTGTTT CCATGAGGCT 360 

TGTGTATCTT TTGCAAGTGT CTCAACAGCA ATCAACGTTT TTGCTATCAC TTTGGACAGA 420 

TATGACATCT CTGTAAAACC TGCAAACCGA ATTCTGACAA TGGGCAGAGC TGTAATGTTA 480 

ATGATATCCA TTTGGATTTT TTCTTTTTTC TCTTTCCTGA TTCCTTTTAT TGAGGTAAAT 540 

TTTTTCAGTC TTCAAAGTGG AAATACCTGG GAAAACAAGA CACTTTTATG TGTCAGTACA 600

AATGAATACT ACACTGAACT GGGAATGTAT TATCACCTCT TAGTACAQAT CCCAATATTC 660

TTTTTCACTG TTGTAGTAAT GTTAATCACA TACACCAAAA TACTTCAGGC TCTTAATATT 720 

CGAATAGGCA CAAGATTTTC AACAGGGCAG AAGAAGAAAG CAAGAAAGAA AAAGACAATT 780 

TCTCTAACCA CACAACATGA GGCTACAGAC ATGTCACAAA GCAGTCGTGG GAGAAATGTA 840

GTCTTTGGTG TAAGAACTTC AGTTTCTGTA ATAATTGCCC TCCGGCGAGC TGTGAAACGA 900

CACCGTGAAC GACGAGAAAG ACAAAAGAGA GTCAAGAGGA TGTCTTTATT GATTATTTCT 960 

ACATTTCTTC TCTGCTGGAC ACCAATTTCT GTTTTAAATA CCACCATTTT ATGTTTAGGC 1020

CCAAGTGACC TTTTAGTAAA ATTAAGATTG TGTTTTTTAG TCATGGCTTA TGGAACAACT 1080 

ATATTTCACC CTCTATTATA TGCATTCACT AGACAAAAAT TTCAAAAGGT CTTGAAAAGT 1140 

AAAATGAAAA AGCGAGTTGT TTCTATAGTA GAAGCTGATC CCCTGCCTAA TAATGCTGTA 1200 

ATACACAACT CTTGGATAGA TCCCAAAAGA AACAAAAAAA TTACCTTTCA AGATAGTGAA 1260 

ATAAGAGAAA AACGTTTAGT GCCTCAGGTT GTCACAGACT AG 1302 
(191) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Met Cys Phe Ser Pro Ile Leu Glu Ile Asn Met Gln Ser Glu Ser Asn
1 5 10 15

Ile Thr Val Arg Asp Asp Ile Asp Asp Ile Asn Thr Asn Met Tyr Gln
20 25 30

Pro Leu Ser Tyr Pro Leu Ser Phe Gln Val Ser Leu Thr Gly Phe Leu
35 40 45

Met Leu Glu Ile Val Leu Gly Leu Gly Ser Asn Leu Thr Val Leu Val
50 55 60

Leu Tyr Cys Met Lys Ser Asn Leu Ile Asn Ser Val Ser Asn Ile Ile
65 70 75 80

Thr Met Asn Leu His Val Leu Asp Val Ile Ile Cys Val Gly Cys Ile
85 90 95

Pro Leu Thr Ile Val Ile Leu Leu Leu Ser Leu Glu Ser Asn Thr Ala
100 105 110

Leu Ile Cys Cys Phe His Glu Ala Cys Val Ser Phe Ala Ser Val Ser
115 120 125

Thr Ala Ile Asn Val Phe Ala Ile Thr Leu Asp Arg Tyr Asp Ile Ser
130 135 140

Val Lys Pro Ala Asn Arg Ile Leu Thr Met Gly Arg Ala Val Met Leu
145 150 155 160

Met Ile Ser Ile Trp Ile Phe Ser Phe Phe Ser Phe Leu Ile Pro Phe
165 170 175

Ile Glu Val Asn Phe Phe Ser Leu Gln Ser Gly Asn Thr Trp Glu Asn
180 185 190

Lys Thr Leu Leu Cys Val Ser Thr Asn Glu Tyr Tyr Thr Glu Leu Gly
195 200 205

Met Tyr Tyr His Leu Leu Val Gln Ile Pro Ile Phe Phe Phe Thr Val
210 215 220

Val Val Met Leu Ile Thr Tyr Thr Lys Ile Leu Gln Ala Leu Asn Ile
225 230 235 240

Arg Ile Gly Thr Arg Phe Ser Thr Gly Gln Lys Lys Lys Ala Arg Lys
245 250 255

Lys Lys Thr Ile Ser Leu Thr Thr Gln His Glu Ala Thr Asp Met Ser
260 265 270

Gln Ser Ser Gly Gly Arg Asn Val Val Phe Gly Val Arg Thr Ser Val
275 280 285

Ser Val Ile Ile Ala Leu Arg Arg Ala Val Lys Arg His Arg Glu Arg
290 295 300

Arg Glu Arg Gln Lys Arg Val Lys Arg Met Ser Leu Leu Ile Ile Ser
305 310 315 320

Thr Phe Leu Leu Cys Trp Thr Pro Ile Ser Val Leu Asn Thr Thr Ile
325 330 335

Leu Cys Leu Gly Pro Ser Asp Leu Leu Val Lys Leu Arg Leu Cys Phe
340 345 350

Leu Val Met Ala Tyr Gly Thr Thr Ile Phe His Pro Leu Leu Tyr Ala
355 360 365

Phe Thr Arg Gln Lys Phe Gln Lys Val Leu Lys Ser Lys Met Lys Lys
370 375 380

Arg Val Val Ser Ile Val Glu Ala Asp Pro Leu Pro Asn Asn Ala Val
385 390 395 400

Ile His Asn Ser Trp Ile Asp Pro Lys Arg Asn Lys Lys Ile Thr Phe
405 410 415

Gln Asp Ser Glu Ile Arg Glu Lys Arg Leu Val Pro Gln Val Val Thr
420 425 430

Asp

(192) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1209 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

ATGTTGTGTC CTTCCAAGAC AGATGGCTCA GGGCACTCTG GTAGGATTCA CCAGGAAACT 60 

CATGGAGAAG GGAAAAGGGA CAAGATTAGC AACAGTGAAG GGAGGGAGAA TGGTGGGAGA 120 

GGATTCCAGA TGAACGGTGG GTCGCTGGAG GCTGAGCATG CCAGCAGGAT GTCAGTTCTC 180 

AGAGCAAAGC CCATGTCAAA CAGCCAACGC TTGCTCCTTC TGTCCCCAGG ATCACCTCCT 240 

CGCACGGGGA GCATCTCCTA CATCAACATC ATCATGCCTT CGGTGTTCGG CACCATCTGC 300 

CTCCTGGGCA TCATCGGGAA CTCCACGGTC ATCTTCGCGG TCGTGAAGAA GTCCAAGCTG 360 

CACTGGTGCA ACAACGTCCC CGACATCTTC ATCATCAACC TCTCGGTAGT AGATCTCCTC 420 

TTTCTCCTGG GCATGCCCTT CATGATCCAC CAGCTCATGG GCAATGGGGT GTGGCACTTT 480 

GGGGAGACCA TGTGCACCCT CATCACGGCC ATGGATGCCA ATAGTCAGTT CACCAGCACC 540 

TACATCCTGA CCGCCATGGC CATTGACCGC TACCTGGCCA CTGTCCACCC CATCTCTTCC 600 

ACGAAGTTCC GGAAGCCCTC TGTGGCCACC CTGGTGATCT GCCTCCTGTG GGCCCTCTCC 660 

TTCATCAGCA TCACCCCTGT GTGGCTGTAT GCCAGACTCA TCCCCTTCCC AGGAGGTGCA 720 

GTGGGCTGCG GCATACGCCT GCCCAACCCA GACACTGACC TCTACTGGTT CACCCTGTAC 780 

CAGTTTTTCC TGGCCTTTGC CCTGCCTTTT GTGGTCATCA CAGCCGCATA CGTGAGGATC 840 

CTGCAGCGCA TGACGTCCTC AGTGGCCCCC GCCTCCCAGC GCAGCATCCG GCTGCGGACA 900 

AAGAGGGTGA AACGCACAGC CATCGCCATC TGTCTGGTCT TCTTTGTGTG CTGGGCACCC 960 

TACTATGTGC TACAGCTGAC CCAGTTGTCC ATCAGCCGCC CGACCCTCAC CTTTGTCTAC 1020 

TTATACAATG CGGCCATCAG CTTGGGCTAT GCCAACAGCT GCCTCAACCC CTTTGTGTAC 1080

ATCGTGCTCT GTGAGACGTT CCGCAAACGC TTGGTCCTGT CGGTGAAGCC TGCAGCCCAG 1140
GGGCAGCTTC GCGCTGTCAG CAACGCTCAG ACGGCTGACG AGGAGAGGAC AGAAAGCAAA 1200 
GGCACCTGA 1209 
(193) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Met Leu Cys Pro Ser Lys Thr Asp Gly Ser Gly His Ser Gly Arg Ile
1 5 10 15

His Gln Glu Thr His Gly Glu Gly Lys Arg Asp Lys Ile Ser Asn Ser
20 25 30

Glu Gly Arg Glu Asn Gly Gly Arg Gly Phe Gln Met Asn Gly Gly Ser
35 40 45

Leu Glu Ala Glu His Ala Ser Arg Met Ser Val Leu Arg Ala Lys Pro
50 55 60

Met Ser Asn Ser Gln Arg Leu Leu Leu Leu Ser Pro Gly Ser Pro Pro
65 70 75 80

Arg Thr Gly Ser Ile Ser Tyr Ile Asn Ile Ile Met Pro Ser Val Phe
85 90 95

Gly Thr Ile Cys Leu Leu Gly Ile Ile Gly Asn Ser Thr Val Ile Phe
100 105 110

Ala Val Val Lys Lys Ser Lys Leu His Trp Cys Asn Asn Val Pro Asp
115 120 125

Ile Phe Ile Ile Asn Leu Ser Val Val Asp Leu Leu Phe Leu Leu Gly
130 135 140

Met Pro Phe Met Ile His Gln Leu Met Gly Asn Gly Val Trp His Phe
145 150 155 160

Gly Glu Thr Met Cys Thr Leu Ile Thr Ala Met Asp Ala Asn Ser Gln
165 170 175

Phe Thr Ser Thr Tyr Ile Leu Thr Ala Met Ala Ile Asp Arg Tyr Leu
180 185 190

Ala Thr Val His Pro Ile Ser Ser Thr Lys Phe Arg Lys Pro Ser Val
195 200 205

Ala Thr Leu Val Ile Cys Leu Leu Trp Ala Leu Ser Phe Ile Ser Ile
210 215 220

Thr Pro Val Trp Leu Tyr Ala Arg Leu Ile Pro Phe Pro Gly Gly Ala
225 230 235 240

Val Gly Cys Gly Ile Arg Leu Pro Asn Pro Asp Thr Asp Leu Tyr Trp
245 250 255

Phe Thr Leu Tyr Gln Phe Phe Leu Ala Phe Ala Leu Pro Phe Val Val
260 265 270

Ile Thr Ala Ala Tyr Val Arg Ile Leu Gln Arg Met Thr Ser Ser Val
275 280 285

Ala Pro Ala Ser Gln Arg Ser Ile Arg Leu Arg Thr Lys Arg Val Lys
290 295 300

Arg Thr Ala Ile Ala Ile Cys Leu Val Phe Phe Val Cys Trp Ala Pro
305 310 315 320

Tyr Tyr Val Leu Gln Leu Thr Gln Leu Ser Ile Ser Arg Pro Thr Leu
325 330 335

Thr Phe Val Tyr Leu Tyr Asn Ala Ala Ile Ser Leu Gly Tyr Ala Asn
340 345 350

Ser Cys Leu Asn Pro Phe Val Tyr Ile Val Leu Cys Glu Thr Phe Arg
355 360 365

Lys Arg Leu Val Leu Ser Val Lys Pro Ala Ala Gln Gly Gln Leu Arg
370 375 380

Ala Val Ser Asn Ala Gln Thr Ala Asp Glu Glu Arg Thr Glu Ser Lys
385 390 395 400

Gly Thr

(194) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:
ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAC 60 
GCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240 

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300 

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360 

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420 

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480 

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540 

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600 

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660 

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720 

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GAAACGCATG 780 

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTGATCAGC 840 

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900 

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCGCCGCCT TCTCCAACAG CTGCCTAAAC 960 

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020 

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080 

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTGA 1128 
(195) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala His Ala Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Lys Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Ala Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(196) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 960 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

ATGCCATTCC CAAACTGCTC AGCCCCCAGC ACTGTGGTGG CCACAGCTGT GGGTGTCTTG 60 

CTGGGGCTGG AGTGTGGGCT GGGTCTGCTG GGCAACGCGG TGGCGCTGTG GACCTTCCTG 120 

TTCCGGGTCA GGGTGTGGAA GCCGTACGCT GTCTACCTGC TCAACCTGGC CCTGGCTGAC 180 

CTGCTGTTGG CTGCGTGCCT GCCTTTCCTG GCCGCCTTCT ACCTGAGCCT CCAGGCTTGG 240 

CATCTGGGCC GTGTGGGCTG CTGGGCCCTG CGCTTCCTGC TGGACCTCAG CCGCAGCGTG 300 

GGGATGGCCT TCCTGGCCGC CGTGGCTTTG GACCGGTACC TCCGTGTGGT CCACCCTCGG 360 

CTTAAGGTCA ACCTGCTGTC TCCTCAGGCG GCCCTGGGGG TCTCGGGCCT CGTCTGGCTC 420 

CTGATGGTCG CCCTCACCTG CCCGGGCTTG CTCATCTCTG AGGCCGCCCA GAACTCCACC 480 

AGGTGCCACA GTTTCTACTC CAGGGCAGAC GGCTCCTTCA GCATCATCTG GCAGGAAGCA 540 

CTCTCCTGCC TTCAGTTTGT CCTCCCCTTT GGCCTCATCG TGTTCTGCAA TGCAGGCATC 600 

ATCAGGGCTC TCCAGAAAAG ACTCCGGGAG CCTGAGAAAC AGCCCAAGCT TCAGCGGGCC 660 

AAGGCACTGG TCACCTTGGT GGTGGTGCTG TTTGCTCTGT GCTTTCTGCC CTGCTTCCTG 720 

GCCAGAGTCC TGATGCACAT CTTCCAGAAT CTGGGGAGCT GCAGGGCCCT TTGTGCAGTG 780 

GCTCATACCT CGGATGTCAC GGGCAGCCTC ACCTACCTGC ACAGTGTCGT CAACCCCGTG 840

GTATACTGCT TCTCCAGCCC CACCTTCAGG AGCTCCTATC GGAGGGTCTT CCACACCCTC 900 
CGAGGCAAAG GGCAGGCAGC AGAGCCCCCA GATTTCAACC CCAGAGACTC CTATTCCTGA 960

(197) INFORMATION FOR SEQ ID NO: 196: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Met Pro Phe Pro Asn Cys Ser Ala Pro Ser Thr Val Val Ala Thr Ala
1 5 10 15

Val Gly Val Leu Leu Gly Leu Glu Cys Gly Leu Gly Leu Leu Gly Asn
20 25 30

Ala Val Ala Leu Trp Thr Phe Leu Phe Arg Val Arg Val Trp Lys Pro
35 40 45

Tyr Ala Val Tyr Leu Leu Asn Leu Ala Leu Ala Asp Leu Leu Leu Ala
50 55 60

Ala Cys Leu Pro Phe Leu Ala Ala Phe Tyr Leu Ser Leu Gln Ala Trp
65 70 75 80

His Leu Gly Arg Val Gly Cys Trp Ala Leu Arg Phe Leu Leu Asp Leu
85 90 95

Ser Arg Ser Val Gly Met Ala Phe Leu Ala Ala Val Ala Leu Asp Arg
100 105 110

Tyr Leu Arg Val Val His Pro Arg Leu Lys Val Asn Leu Leu Ser Pro
115 120 125

Gln Ala Ala Leu Gly Val Ser Gly Leu Val Trp Leu Leu Met Val Ala
130 135 140

Leu Thr Cys Pro Gly Leu Leu Ile Ser Glu Ala Ala Gln Asn Ser Thr
145 150 155 160

Arg Cys His Ser Phe Tyr Ser Arg Ala Asp Gly Ser Phe Ser Ile Ile
165 170 175

Trp Gln Glu Ala Leu Ser Cys Leu Gln Phe Val Leu Pro Phe Gly Leu
180 185 190

Ile Val Phe Cys Asn Ala Gly Ile Ile Arg Ala Leu Gln Lys Arg Leu
195 200 205

Arg Glu Pro Glu Lys Gln Pro Lys Leu Gln Arg Ala Lys Ala Leu Val
210 215 220

Thr Leu Val Val Val Leu Phe Ala Leu Cys Phe Leu Pro Cys Phe Leu
225 230 235 240

Ala Arg Val Leu Met His Ile Phe Gln Asn Leu Gly Ser Cys Arg Ala
245 250 255

Leu Cys Ala Val Ala His Thr Ser Asp Val Thr Gly Ser Leu Thr Tyr
260 265 270

Leu His Ser Val Val Asn Pro Val Val Tyr Cys Phe Ser Ser Pro Thr
275 280 285

Phe Arg Ser Ser Tyr Arg Arg Val Phe His Thr Leu Arg Gly Lys Gly
290 295 300

Gln Ala Ala Glu Pro Pro Asp Phe Asn Pro Arg Asp Ser Tyr Ser
305 310 315

(198) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1143 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

ATGGAGGAAG GTGGTGATTT TGACAACTAC TATGGGGCAG ACAACCAGTC TGAGTGTGAG 60 

TACACAGACT GGAAATCCTC GGGGGCCCTC ATCCCTGCCA TCTACATGTT GGTCTTCCTC 120 

CTGGGCACCA CGGGAAACGG TCTGGTGCTC TGGACCGTGT TTCGGAGCAG CCGGGAGAAG 180 

AGGCGCTCAG CTGATATCTT CATTGCTAGC CTGGCGGTGG CTGACCTGAC CTTCGTGGTG 240 

ACGCTGCCCC TGTGGGCTAC CTACACGTAC CGGGACTATG ACTGGCCCTT TGGGACCTTC 300 

TTCTGCAAGC TCAGCAGCTA CCTCATCTTC GTCAACATGT ACGCCAGCGT CTTCTGCCTC 360 

ACCGGCCTCA GCTTCGACCG CTACCTGGCC ATCGTGAGGC CAGTGGCCAA TGCTCGGCTG 420 

AGGCTGCGGG TCAGCGGGGC CGTGGCCACG GCAGTTCTTT GGGTGCTGGC CGCCCTCCTG 480 

GCCATGCCTG TCATGGTGTT ACGCACCACC GGGGACTTGG AGAACACCAC TAAGGTGCAG 540 

TGCTACATGG ACTACTCCAT GGTGGCCACT GTGAGCTCAG AGTGGGCCTG GGAGGTGGGC 600 

CTTGGGGTCT CGTCCACCAC CGTGGGCTTT GTGGTGCCCT TCACCATCAT GCTGACCTGT 660 

TACTTCTTCA TCGCCCAAAC CATCGCTGGC CACTTCCGCA AGGAACGCAT CGAGGGCCTG 720 

CGGAAGCGGC GCCGGCTTAA GAGCATCATC GTGGTGCTGG TGGTGACCTT TGCCCTGTGC 780 

TGGATGCCCT ACCACCTGGT GAAGACGCTG TACATGCTGG GCAGCCTGCT GCACTGGCCC 840 

TGTGACTTTG ACCTCTTCCT CATGAACATC TTCCCCTACT GCACCTGCAT CAGCTACGTC 900

AACAGCTGCC TCAACCCCTT CCTCTATGCC TTTTTCGACC CCCGCTTCCG CCAGGCCTGC 960

ACCTCCATGC TCTGCTGTGG CCAGAGCAGG TGCGCAGGCA CCTCCCACAG CAGCAGTCGG 1020

GAGAAGTCAG CCAGCTACTC TTCGGGGCAC AGCCAGGGGC CCGGCCCCAA CATGGGCAAG 1080

GGTGGAGAAC AGATGCACGA GAAATCCATC CCCTACAGCC AGGAGACCCT TGTGGTTGAC 1140

TAG 1143

(199) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 380 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Met Glu Glu Gly Gly Asp Phe Asp Asn Tyr Tyr Gly Ala Asp Asn Gln
1 5 10 15

Ser Glu Cys Glu Tyr Thr Asp Trp Lys Ser Ser Gly Ala Leu Ile Pro
20 25 30

Ala Ile Tyr Met Leu Val Phe Leu Leu Gly Thr Thr Gly Asn Gly Leu
35 40 45

Val Leu Trp Thr Val Phe Arg Ser Ser Arg Glu Lys Arg Arg Ser Ala
50 55 60

Asp Ile Phe Ile Ala Ser Leu Ala Val Ala Asp Leu Thr Phe Val Val
65 70 75 80

Thr Leu Pro Leu Trp Ala Thr Tyr Thr Tyr Arg Asp Tyr Asp Trp Pro
85 90 95

Phe Gly Thr Phe Phe Cys Lys Leu Ser Ser Tyr Leu Ile Phe Val Asn
100 105 110

Met Tyr Ala Ser Val Phe Cys Leu Thr Gly Leu Ser Phe Asp Arg Tyr
115 120 125

Leu Ala Ile Val Arg Pro Val Ala Asn Ala Arg Leu Arg Leu Arg Val
130 135 140

Ser Gly Ala Val Ala Thr Ala Val Leu Trp Val Leu Ala Ala Leu Leu
145 150 155 160

Ala Met Pro Val Met Val Leu Arg Thr Thr Gly Asp Leu Glu Asn Thr
165 170 175

Thr Lys Val Gln Cys Tyr Met Asp Tyr Ser Met Val Ala Thr Val Ser
180 185 190

Ser Glu Trp Ala Trp Glu Val Gly Leu Gly Val Ser Ser Thr Thr Val
195 200 205

Gly Phe Val Val Pro Phe Thr Ile Met Leu Thr Cys Tyr Phe Phe Ile
210 215 220

Ala Gln Thr Ile Ala Gly His Phe Arg Lys Glu Arg Ile Glu Gly Leu
225 230 235 240

Arg Lys Arg Arg Arg Leu Lys Ser Ile Ile Val Val Leu Val Val Thr
245 250 255

Phe Ala Leu Cys Trp Met Pro Tyr His Leu Val Lys Thr Leu Tyr Met
260 265 270

Leu Gly Ser Leu Leu His Trp Pro Cys Asp Phe Asp Leu Phe Leu Met
275 280 285

Asn Ile Phe Pro Tyr Cys Thr Cys Ile Ser Tyr Val Asn Ser Cys Leu
290 295 300

Asn Pro Phe Leu Tyr Ala Phe Phe Asp Pro Arg Phe Arg Gln Ala Cys
305 310 315 320

Thr Ser Met Leu Cys Cys Gly Gln Ser Arg Cys Ala Gly Thr Ser His
325 330 335

Ser Ser Ser Gly Glu Lys Ser Ala Ser Tyr Ser Ser Gly His Ser Gln
340 345 350

Gly Pro Gly Pro Asn Met Gly Lys Gly Gly Glu Gln Met His Glu Lys
355 360 365

Ser Ile Pro Tyr Ser Gln Glu Thr Leu Val Val Asp
370 375 380

(200) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1119 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 



ATGAACTACC CGCTAACGCT GGAAATGGAC CTCGAGAACC TGGAGGACCT GTTCTGGGAA 60
CTGGACAGAT TGGACAACTA TAACGACACC TCCCTGGTGG AAAATCATCT CTGCCCTGCC 120
ACAGAGGGTC CCCTCATGGC CTCCTTCAAG GCCGTGTTCG TGCCCGTGGC CTACAGCCTC 180
ATCTTCCTCC TGGGCGTGAT CGGCAACGTC CTGGTCCTGG TCATCCTGGA GCGGCACCGG 240
CAGACACGCA GTTCCACGGA GACCTTCCTG TTCCACCTGG CCGTGGCCGA CCTCCTGCTG 300
GTCTTCATCT TGCCCTTTGC CGTGGCCGAG GGCTCTGTGG GCTGGGTCCT GGGGACCTTC 360
CTCTGCAAAA CTGTGATTGC CCTGCACAAA GTCAACTTCT ACTGCAGCAG CCTGCTCCTG 420
GCCTGCATCG CCGTGGACCG CTACCTGGCC ATTCTCCACG CCGTCCATCC CTACCGCCAC 480
CGCCGCCTCC TCTCCATCCA CATCACCTGT GGGACCATCT GGCTGGTCGG CTTCCTCCTT 540
GCCTTGCCAG AGATTCTCTT CGCCAAAGTC AGCCAAGGCC ATCACAACAA CTCCCTGCCA 600
CGTTGCACCT TCTCCCAAGA GAACCAAGCA GAAACGCATG CCTGGTTCAC CTCCCGATTC 660
CTCTACCATG TGGCGGGATT CCTGCTGCCC ATGCTGGTGA TGGGCTGGTG CTACGTGGGG 720
GTAGTGCACA GGTTGCGCCA GGCCCAGCGG CGCCCTCAGC GGCAGAAGGC AAAAAGGGTG 780
GCCATCCTGG TGACAAGCAT CTTCTTCCTC TGCTGGTCAC CCTACCACAT CGTCATCTTC 840
CTGGACACCC TGGCGAGGCT GAAGGCCGTG GACAATACCT GCAAGCTGAA TGGCTCTCTC 900
CCCGTGGCCA TCACCATGTG TGAGTTCCTG GGCCTCGCCC ACTGCTGCCT CAACCCCATG 960
CTCTACACTT TCGCCGGCGT GAAGTTCCGC AGTGACCTGT CGCGGCTCCT GACCAAGCTG 1020
GGCTGTACCG GCCCTGCCTC CCTGTGCCAG CTCTTCCCTA GCTGGCGCAG GAGCAGTCTC 1080
TCTGAGTCAG AGAATGCCAC CTCTCTCACC ACGTTCTAG 1119
(201) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 372 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

Met Asn Tyr Pro Leu Thr Leu Glu Met Asp Leu Glu Asn Leu Glu Asp
1 5 10 15

Leu Phe Trp Glu Leu Asp Arg Leu Asp Asn Tyr Asn Asp Thr Ser Leu
20 25 30

Val Glu Asn His Leu Cys Pro Ala Thr Glu Gly Pro Leu Met Ala Ser
35 40 45

Phe Lys Ala Val Phe Val Pro Val Ala Tyr Ser Leu Ile Phe Leu Leu
50 55 60

Gly Val Ile Gly Asn Val Leu Val Leu Val Ile Leu Glu Arg His Arg
65 70 75 80

Gln Thr Arg Ser Ser Thr Glu Thr Phe Leu Phe His Leu Ala Val Ala
85 90 95

Asp Leu Leu Leu Val Phe Ile Leu Pro Phe Ala Val Ala Glu Gly Ser
100 105 110

Val Gly Trp Val Leu Gly Thr Phe Leu Cys Lys Thr Val Ile Ala Leu
115 120 125

His Lys Val Asn Phe Tyr Cys Ser Ser Leu Leu Leu Ala Cys Ile Ala
130 135 140

Val Asp Arg Tyr Leu Ala Ile Val His Ala Val His Ala Tyr Arg His
145 150 155 160

Arg Arg Leu Leu Ser Ile His Ile Thr Cys Gly Thr Ile Trp Leu Val
165 170 175

Gly Phe Leu Leu Ala Leu Pro Glu Ile Leu Phe Ala Lys Val Ser Gln
180 185 190

Gly His His Asn Asn Ser Leu Pro Arg Cys Thr Phe Ser Gln Glu Asn
195 200 205

Gln Ala Glu Thr His Ala Trp Phe Thr Ser Arg Phe Leu Tyr His Val
210 215 220

Ala Gly Phe Leu Leu Pro Met Leu Val Met Gly Trp Cys Tyr Val Gly
225 230 235 240

Val Val His Arg Leu Arg Gln Ala Gln Arg Arg Pro Gln Arg Gln Lys
245 250 255

Ala Lys Arg Val Ala Ile Leu Val Thr Ser Ile Phe Phe Leu Cys Trp
260 265 270

Ser Pro Tyr His Ile Val Ile Phe Leu Asp Thr Leu Ala Arg Leu Lys
275 280 285

Ala Val Asp Asn Thr Cys Lys Leu Asn Gly Ser Leu Pro Val Ala Ile
290 295 300

Thr Met Cys Glu Phe Leu Gly Leu Ala His Cys Cys Leu Asn Pro Met
305 310 315 320

Leu Tyr Thr Phe Ala Gly Val Lys Phe Arg Ser Asp Leu Ser Arg Leu
325 330 335

Leu Thr Lys Leu Gly Cys Thr Gly Pro Ala Ser Leu Cys Gln Leu Phe
340 345 350

Pro Ser Trp Arg Arg Ser Ser Leu Ser Glu Ser Glu Asn Ala Thr Ser
355 360 365

Leu Thr Thr Phe
370

(202) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAG 60 

CCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120 

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTCTGCT ACTCCCTCAT TGTCCGGGTG 720

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GAAGCGCATG 780

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCACCGCCT TCTCCAACAG CTGCCTAAAC 960

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTAG 1128

(203) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala Gln Pro Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Lys Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Thr Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(204) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1137 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 

ATGGACCTGG GGAAACCAAT GAAAAGCGTG CTGGTGGTGG CTCTCCTTGT CATTTTCCAG 60 

GTATGCCTGT GTCAAGATGA GGTCACGGAC GATTACATCG GAGACAACAC CACAGTGGAC 120

TACACTTTGT TCGAGTCTTT GTGCTCCAAG AAGGACGTGC GGAACTTTAA AGCCTGGTTC 180

CTCCCTATCA TGTACTCCAT CATTTGTTTC GTCGGCCTAC TGGGCAATGG GCTGGTCGTG 240

TTGACCTATA TCTATTTCAA GAGGCTCAAG ACCATGACCG ATACCTACCT GCTCAACCTG 300

GCGGTGGCAG ACATCCTCTT CCTCCTGACC CTTCCCTTCT GGGCCTACAG CGCGGCCAAG 360

TCCTGGGTCT TCGGTGTCCA CTTTTGCAAG CTCATCTTTG CCATCTACAA GATGAGCTTC 420

TTCAGTGGCA TGCTCCTACT TCTTTGCATC AGCATTGACC GCTACGTGGC CATCGTCCAG 480

GCTGTCTCAG CTCACCGCCA CCGTGCCCGC GTCCTTCTCA TCAGCAAGCT GTCCTGTGTG 540

GGCATCTGGA TACTAGCCAC AGTGCTCTCC ATCCCAGAGC TCCTGTACAG TGACCTCCAG 600 

AGGAGCAGCA GTGAGCAAGC GATGCGATGC TCTCTCATCA CAGAGCATGT GGAGGCCTTT 660 

ATCACCATCC AGGTGGCCCA GATGGTGATC GGCTTTCTGG TCCCCCTGCT GGCCATGAGC 720 

TTCTGTTACC TTGTCATCAT CCGCACCCTG CTCCAGGCAC GCAACTTTGA GCGCAACAAG 780 

GCCAAAAAGG TGATCATCGC TGTGGTCGTG GTCTTCATAG TCTTCCAGCT GCCCTACAAT 840 

GGGGTGGTCC TGGCCCAGAC GGTGGCCAAC TTCAACATCA CCAGTAGCAC CTGTGAGCTC 900 

AGTAAGCAAC TCAACATCGC CTACGACGTC ACCTACAGCC TGGCCTGCGT CCGCTGCTGC 960 

GTCAACCCTT TCTTGTACGC CTTCATCGGC GTCAAGTTCC GCAACGATCT CTTCAAGCTC 1020

TTCAAGGACC TGGGCTGCCT CAGCCAGGAG CAGCTCCGGC AGTGGTCTTC CTGTCGGCAC 1080 

ATCCGGCGCT CCTCCATGAG TGTGGAGGCC GAGACCACCA CCACCTTCTC CCCATAG 1137 
(205) INFORMATION FOR SEQ ID NO:204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 378 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu
1 5 10 15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr
20 25 30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys
35 40 45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met
50 55 60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65 70 75 80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr
85 90 95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro
100 105 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe
115 120 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met
130 135 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145 150 155 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys
165 170 175

Leu Ser Cys Val Gly Ile Trp Ile Leu Ala Thr Val Leu Ser Ile Pro
180 185 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
195 200 205

Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln
210 215 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225 230 235 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe
245 250 255

Glu Arg Asn Lys Ala Lys Lys Val Ile Ile Ala Val Val Val Val Phe
260 265 270

Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val
275 280 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu
290 295 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305 310 315 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp
325 330 335

Leu Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu
340 345 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val
355 360 365

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
370 375

(206) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1086 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:205: 

ATGGATATAC AAATGGCAAA CAATTTTACT CCGCCCTCTG CAACTCCTCA GGGAAATGAC 60 

TGTGACCTCT ATGCACATCA CAGCACGGCC AGGATAGTAA TGCCTCTGCA TTACAGCCTC 120 

GTCTTCATCA TTGGGCTCGT GGGAAACTTA CTAGCCTTGG TCGTCATTGT TCAAAACAGG 180 

AAAAAAATCA ACTCTACCAC CCTCTATTCA ACAAATTTGG TGATTTCTGA TATACTTTTT 240 

ACCACGGCTT TGCCTACACG AATAGCCTAC TATGCAATGG GCTTTGACTG GAGAATCGGA 300 

GATGCCTTGT GTAGGATAAC TGCGCTAGTG TTTTACATCA ACACATATGC AGGTGTGAAC 360 

TTTATGACCT GCCTGAGTAT TGACCGCTTC ATTGCTGTGG TGCACCCTCT ACGCTACAAC 420 

AAGATAAAAA GGATTGAACA TGCAAAAGGC GTGTGCATAT TTGTCTGGAT TCTAGTATTT 480 

GCTCAGACAC TCCCACTCCT CATCAACCCT ATGTCAAAGC AGGAGGCTGA AAGGATTACA 540 

TGCATGGAGT ATCCAAACTT TGAAGAAACT AAATCTCTTC CCTGGATTCT GCTTGGGGCA 600 

TGTTTCATAG GATATGTACT TCCACTTATA ATCATTCTCA TCTGCTATTC TCAGATCTGC 660 

TGCAAACTCT TCAGAACTGC CAAACAAAAC CCACTCACTG AGAAATCTGG TGTAAACAAA 720 

AAGGCTAAAA ACACAATTAT TCTTATTATT GTTGTGTTTG TTCTCTGTTT CACACCTTAC 780 

CATGTTGCAA TTATTCAACA TATGATTAAG AAGCTTCGTT TCTCTAATTT CCTGGAATGT 840 

AGCCAAAGAC ATTCGTTCCA GATTTCTCTG CACTTTACAG TATGCCTGAT GAACTTCAAT 900 

TGCTGCATGG ACCCTTTTAT CTACTTCTTT GCATGTAAAG GGTATAAGAG AAAGGTTATG 960 

AGGATGCTGA AACGGCAAGT CAGTGTATCG ATTTCTAGTG CTGTGAAGTC AGCCCCTGAA 1020 

GAAAATTCAC GTGAAATGAC AGAAACGCAG ATGATGATAC ATTCCAAGTC TTCAAATGGA 1080 

AAGTGA 1086 
(207) INFORMATION FOR SEQ ID NO:206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

Met Asp Ile Gln Met Ala Asn Asn Phe Thr Pro Pro Ser Ala Thr Pro
1 5 10 15

Gln Gly Asn Asp Cys Asp Leu Tyr Ala His His Ser Thr Ala Arg Ile
20 25 30

Val Met Pro Leu His Tyr Ser Leu Val Phe Ile Ile Gly Leu Val Gly
35 40 45

Asn Leu Leu Ala Leu Val Val Ile Val Gln Asn Arg Lys Lys Ile Asn
50 55 60

Ser Thr Thr Leu Tyr Ser Thr Asn Leu Val Ile Ser Asp Ile Leu Phe
65 70 75 80

Thr Thr Ala Leu Pro Thr Arg Ile Ala Tyr Tyr Ala Met Gly Phe Asp
85 90 95

Trp Arg Ile Gly Asp Ala Leu Cys Arg Ile Thr Ala Leu Val Phe Tyr
100 105 110

Ile Asn Thr Tyr Ala Gly Val Asn Phe Met Thr Cys Leu Ser Ile Asp
115 120 125

Arg Phe Ile Ala Val Val His Pro Leu Arg Tyr Asn Lys Ile Lys Arg
130 135 140

Ile Glu His Ala Lys Gly Val Cys Ile Phe Val Trp Ile Leu Val Phe
145 150 155 160

Ala Gln Thr Leu Pro Leu Leu Ile Asn Pro Met Ser Lys Gln Glu Ala
165 170 175

Glu Arg Ile Thr Cys Met Glu Tyr Pro Asn Phe Glu Glu Thr Lys Ser
180 185 190

Leu Pro Trp Ile Leu Leu Gly Ala Cys Phe Ile Gly Tyr Val Leu Pro
195 200 205

Leu Ile Ile Ile Leu Ile Cys Tyr Ser Gln Ile Cys Cys Lys Leu Phe
210 215 220

Arg Thr Ala Lys Gln Asn Pro Leu Thr Glu Lys Ser Gly Val Asn Lys
225 230 235 240

Lys Ala Lys Asn Thr Ile Ile Leu Ile Ile Val Val Phe Val Leu Cys
245 250 255

Phe Thr Pro Tyr His Val Ala Ile Ile Gln His Met Ile Lys Lys Leu
260 265 270

Arg Phe Ser Asn Phe Leu Glu Cys Ser Gln Arg His Ser Phe Gln Ile
275 280 285

Ser Leu His Phe Thr Val Cys Leu Met Asn Phe Asn Cys Cys Met Asp
290 295 300

Pro Phe Ile Tyr Phe Phe Ala Cys Lys Gly Tyr Lys Arg Lys Val Met
305 310 315 320

Arg Met Leu Lys Arg Gln Val Ser Val Ser Ile Ser Ser Ala Val Lys
325 330 335

Ser Ala Pro Glu Glu Asn Ser Arg Glu Met Thr Glu Thr Gln Met Met
340 345 350

Ile His Ser Lys Ser Ser Asn Gly Lys
355 360

(208) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1446 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:



ATGCGGTGGC TGTGGCCCCT GGCTGTCTCT CTTGCTGTGA TTTTGGCTGT GGGGCTAAGC 60
AGGGTCTCTG GGGGTGCCCC CCTGCACCTG GGCAGGCACA GAGCCGAGAC CCAGGAGCAG 120
CAGAGCCGAT CCAAGAGGGG CACCGAGGAT GAGGAGGCCA AGGGCGTGCA GCAGTATGTG 180
CCTGAGGAGT GGGCGGAGTA CCCCCGGCCC ATTCACCCTG CTGGCCTGCA GCCAACCAAG 240
CCCTTGGTGG CCACCAGCCC TAACCCCGAC AAGGATGGGG GCACCCCAGA CAGTGGGCAG 300
GAACTGAGGG GCAATCTGAC AGGGGCACCA GGGCAGAGGC TACAGATCCA GAACCCCCTG 360
TATCCGGTGA CCGAGAGCTC CTACAGTGCC TATGCCATCA TGCTTCTGGC GCTGGTGGTG 420
TTTGCGGTGG GCATTGTGGG CAACCTGTCG GTCATGTGCA TCGTGTGGCA CAGCTACTAC 480
CTGAAGAGCG CCTGGAACTC CATCCTTGCC AGCCTGGCCC TCTGGGATTT TCTGGTCCTC 540
TTTTTCTGCC TCCCTATTGT CATCTTCAAC GAGATCACCA AGCAGAGGCT ACTGGGTGAC 600
GTTTCTTGTC GTGCCGTGCC CTTCATGGAG GTCTCCTCTC TGGGAGTCAC GACTTTCAGC 660
CTCTGTGCCC TGGGCATTGA CCGCTTCCAC GTGGCCACCA GCACCCTGCC CAAGGTGAGG 720
CCCATCGAGC GGTGCCAATC CATCCTGGCC AAGTTGGCTG TCATCTGGGT GGGCTCCATG 780
ACGCTGGCOX. IX^CCl^GCT CCO^CTG^ CAOCOX^CAC AGGAGCCTGC CCCCACCATG 840
GGCACCCTGG ACTCATGCAT CATGAAACCC TCAGCCAGCC TGCCCGAGTC CCTGTATTCA 900
CTGGTGATGA CCTACCAGAA CGCCCGCATG TGGTGGTACT TTGGCTGCTA CTTCTGCCTG 960
CCCATCCTCT TCACAGTCAC CTGCCAGCTG GT^CAX^c GGGTGCGAGG CCCTCCAGGG 1020
AGGAAGTCAG AGTGCAGGGC CAGCAAGCAC GAGCAGT^^ AGAGCCAGCT CAAGAGCACC 1080
G^TGGGCC TGACCGTGGT CTACGCCTTC ..3CACCCTCC CAGAGAACGT C...CAACATC 1140
GTCGTGGCCT ACCTCTCCAC CGAGCTGACC CGCCAGACCC I^CCTCCT GGGCCTCATC 1200
AACCAGTTCT CCACCTTCTT CAAGGGCGCC ATCACCCCAG TGCTGCTCCT TTGCATCTGC 1260
AGGCCGC^ GCCAGGCCTT CC^GACT^ O^CTGC^ GCTGCTGIX^ GGAG1..CGGC 1320
GGGGCTTCGG AGGCCTCTGC TGCCAATGGG TCGGACAACA AGCTCAAGAC CGAGGTGTCC 1380
TCTTCCATCT ACTTCCACAA GCCCAGGGAG TCACCCCCAC TCCTGCCCCT GGGCACACCT 1440
TGCTGA 1446

(209) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

Met Arg Trp Leu Trp Pro Leu Ala Val Ser Leu Ala Val Ile Leu Ala
1 5 10 15

Val Gly Leu Ser Arg Val Ser Gly Gly Ala Pro Leu His Leu Gly Arg
20 25 30

His Arg Ala Glu Thr Gln Glu Gln Gln Ser Arg Ser Lys Arg Gly Thr
35 40 45

Glu Asp Glu Glu Ala Lys Gly Val Gln Gln Tyr Val Pro Glu Glu Trp
50 55 60

Ala Glu Tyr Pro Arg Pro Ile His Pro Ala Gly Leu Gln Pro Thr Lys
65 70 75 80

Pro Leu Val Ala Thr Ser Pro Asn Pro Asp Lys Asp Gly Gly Thr Pro
85 90 95

Asp Ser Gly Gln Glu Leu Arg Gly Asn Leu Thr Gly Ala Pro Gly Gln
100 105 110

Arg Leu Gln Ile Gln Asn Pro Leu Tyr Pro Val Thr Glu Ser Ser Tyr
115 120 125

Ser Ala Tyr Ala Ile Met Leu Leu Ala Leu Val Val Phe Ala Val Gly
130 135 140

Ile Val Gly Asn Leu Ser Val Met Cys Ile Val Trp His Ser Tyr Tyr
145 150 155 160

Leu Lys Ser Ala Trp Asn Ser Ile Leu Ala Ser Leu Ala Leu Trp Asp
165 170 175

Phe Leu Val Leu Phe Phe Cys Leu Pro Ile Val Ile Phe Asn Glu Ile
180 185 190

Thr Lys Gln Arg Leu Leu Gly Asp Val Ser Cys Arg Ala Val Pro Phe
195 200 205

Met Glu Val Ser Ser Leu Gly Val Thr Thr Phe Ser Leu Cys Ala Leu
210 215 220

Gly Ile Asp Arg Phe His Val Ala Thr Ser Thr Leu Pro Lys Val Arg
225 230 235 240

Pro Ile Glu Arg Cys Gln Ser Ile Leu Ala Lys Leu Ala Val Ile Trp
245 250 255

Val Gly Ser Met Thr Leu Ala Val Pro Glu Leu Leu Leu Trp Gln Leu
260 265 270

Ala Gln Glu Pro Ala Pro Thr Met Gly Thr Leu Asp Ser Cys Ile Met
275 280 285

Lys Pro Ser Ala Ser Leu Pro Glu Ser Leu Tyr Ser Leu Val Met Thr
290 295 300

Tyr Gln Asn Ala Arg Met Trp Trp Tyr Phe Gly Cys Tyr Phe Cys Leu
305 310 315 320

Pro Ile Leu Phe Thr Val Thr Cys Gln Leu Val Thr Trp Arg Val Arg
325 330 335

Gly Pro Pro Gly Arg Lys Ser Glu Cys Arg Ala Ser Lys His Glu Gln
340 345 350

Cys Glu Ser Gln Leu Lys Ser Thr Val Val Gly Leu Thr Val Val Tyr
355 360 365

Ala Phe Cys Thr Leu Pro Glu Asn Val Cys Asn Ile Val Val Ala Tyr
370 375 380

Leu Ser Thr Glu Leu Thr Arg Gln Thr Leu Asp Leu Leu Gly Leu Ile
385 390 395 400

Asn Gln Phe Ser Thr Phe Phe Lys Gly Ala Ile Thr Pro Val Leu Leu
405 410 415

Leu Cys Ile Cys Arg Pro Leu Gly Gln Ala Phe Leu Asp Cys Cys Cys
420 425 430

Cys Cys Cys Cys Glu Glu Cys Gly Gly Ala Ser Glu Ala Ser Ala Ala
435 440 445

Asn Gly Ser Asp Asn Lys Leu Lys Thr Glu Val Ser Ser Ser Ile Tyr
450 455 460

Phe His Lys Pro Arg Glu Ser Pro Pro Leu Leu Pro Leu Gly Thr Pro
465 470 475 480

Cys

(210) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

ATGTGGAACG CGACGCCCAG CGAAGAGCCG GGGTTCAACC TCACACTGGC CGACCTCGAC 60

TGGGATCCrr CCCCCGGCAA CGACTCGCK GGCGACGAGC TGCOTSCAGCT CTTCCCCGCG 120 

CCGCTGCTGG CGGGCGTCAC AGCCACCTGC GTGGCACTCT TCGTGGTGGG TATCGCTGGC 180 

AACCTGCTCA CCATGCTGGT GGTGTCGCGC TTCCGCGAGC TGCGCACCAC CACCAACCTC 240 

TACCTGTCCA GCATGGCCTT CTCCGATCTG CTCATCTTCC TCTCCATGCC CCTGGACCTC 300

GTTCGCCTCT GGCAGTACCG GCCCTGGAAC TTCGGCGACC TCCTCTGCAA ACTCTTCCAA 360

TTCGTCAGTG AGAGCTGCAC CTACGCCACG GT^CTCACCA TCACAGCGCT GAGCGTCGAG 420 

CGCTACTTCG CCATCTGCTT CCCACTCCGG GCCAAGGTGG TGGTCACCAA GGGGCGGGTG 480

AAGCTGGTCA TCTTCGTCAT CTGGGCCGTG GCCTTCTGCA GCGCCGGGCC CATCTTCGTC 540

CTAGTCGGGG TGGAGCACGA GAACGGCACC GACCCTTGGG ACACCAACGA GTGCCGCCCC 600

ACCGAGTTTG CGGTGCGCTC TGGACTGCTC ACGGTCATGG TGTGGGTGTC CAGCATCTTC 660 

TTCTTCCTTC CTGTCTTCTG TCTCACGGTC CTCTACAGTC TCATCGGCAG GAAGCTGTGG 720

CGGAGGAGGC GCGGCGATGC TGTCGTGGGT GCCTCGCTCA GGGACCAGAA CCACAAGCAA 780

ACCAAGAAAA TGCTGGCTGT AGTGGTGTTT GCCTTCATCC TCTGCTGGCT CCCCTTCCAC 840

GTAGGGCGAT ATTTATTTTC CAAATCCTTT GAGCCTGGCT CCTTGGAGAT TGCTCAGATC 900 

AGCCAGTACT GCAACCTCGT GTCCTTTGTC CTCTTCTACC TCAGTGCTGC CATCAACCCC 960 

ATTCTGTACA ACATCATGTC CAAGAAGTAC CGGGTGGCAG TGTTCAGACT TCTGGGATTC 1020 

GAACCCTTCT CCCAGAGAAA GCTCTCCACT CTGAAAGATG AAAGTTCTCG GGCCTGGACA 1080 

GAATCTAGTA TTAATACATG A 1101 
(211) INFORMATION FOR SEQ ID NO:210:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 

Met Trp Asn Ala Thr Pro Ser Glu Glu Pro Gly Phe Asn Leu Thr Leu
1 5 10 15

Ala Asp Leu Asp Trp Asp Ala Ser Pro Gly Asn Asp Ser Leu Gly Asp
20 25 30

Glu Leu Leu Gln Leu Phe Pro Ala Pro Leu Leu Ala Gly Val Thr Ala
35 40 45

Thr Cys Val Ala Leu Phe Val Val Gly Ile Ala Gly Asn Leu Leu Thr
50 55 60

Met Leu Val Val Ser Arg Phe Arg Glu Leu Arg Thr Thr Thr Asn Leu
65 70 75 80

Tyr Leu Ser Ser Met Ala Phe Ser Asp Leu Leu Ile Phe Leu Cys Met
85 90 95

Pro Leu Asp Leu Val Arg Leu Trp Gln Tyr Arg Pro Trp Asn Phe Gly
100 105 110

Asp Leu Leu Cys Lys Leu Phe Gln Phe Val Ser Glu Ser Cys Thr Tyr
115 120 125

Ala Thr Val Leu Thr Ile Thr Ala Leu Ser Val Glu Arg Tyr Phe Ala
130 135 140

Ile Cys Phe Pro Leu Arg Ala Lys Val Val Val Thr Lys Gly Arg Val
145 150 155 160

Lys Leu Val Ile Phe Val Ile Trp Ala Val Ala Phe Cys Ser Ala Gly
165 170 175

Pro Ile Phe Val Leu Val Gly Val Glu His Glu Asn Gly Thr Asp Pro
180 185 190

Trp Asp Thr Asn Glu Cys Arg Pro Thr Glu Phe Ala Val Arg Ser Gly
195 200 205

Leu Leu Thr Val Met Val Trp Val Ser Ser Ile Phe Phe Phe Leu Pro
210 215 220

Val Phe Cys Leu Thr Val Leu Tyr Ser Leu Ile Gly Arg Lys Leu Trp
225 230 235 240

Arg Arg Arg Arg Gly Asp Ala Val Val Gly Ala Ser Leu Arg Asp Gln
245 250 255

Asn His Lys Gln Thr Lys Lys Met Leu Ala Val Val Val Phe Ala Phe
260 265 270

Ile Leu Cys Trp Leu Pro Phe His Val Gly Arg Tyr Leu Phe Ser Lys
275 280 285

Ser Phe Glu Pro Gly Ser Leu Glu Ile Ala Gln Ile Ser Gln Tyr Cys
290 295 300

Asn Leu Val Ser Phe Val Leu Phe Tyr Leu Ser Ala Ala Ile Asn Pro
305 310 315 320

Ile Leu Tyr Asn Ile Met Ser Lys Lys Tyr Arg Val Ala Val Phe Arg
325 330 335

Leu Leu Gly Phe Glu Pro Phe Ser Gln Arg Lys Leu Ser Thr Leu Lys
340 345 350

Asp Glu Ser Ser Arg Ala Trp Thr Glu Ser Ser Ile Asn Thr
355 360 365

(212) INFORMATION FOR SEQ ID NO:211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

ATGCGAGCCC CGGGCGCGCT TCTCGCCCGC ATGTCGCGGC TACTCCTTCT GCTACTGCTC 60

AAGGTGTCTG CCTCTTCTGC CCTCGGGGTC GCCCCTGCGT CCAGAAACGA AACTTGTCTG 120

GGGGAGAGCT GTGCACCTAC AGTGATCCAG CGCCGCGGCA GGGACGCCTG GGGACCGGGA 180

AATTCTGCAA GAGACGTTCT GCGAGCCCGA GCACCCAGGG AGGAGCAGGG GGCAGCGTTT 240

CTTGCGGGAC CCTCCTGGGA CCTGCCGGCG GCCCCGGGCC GTGACCCGGC TGCAGGCAGA 300

GGGGCGGAGG CGTCGGCAGC CGGACCCCCG GGACCTCCAA CCAGGCCACC TGGCCCCTGG 360

AGGTGGAAAG GTGCTCGGGG TCAGGAGCCT TCTGAAACTT TGGGGAGAGG GAACCCCACG 420

GCCCTCCAGC TCTTCCTTCA GATCTCAGAG GAGGAAGAGA AGGGTCCCAG AGGCGCTGGC 480

ATTTCCGGGC GTAGCCAGGA GCAGAGTGTG AAGACAGTCC CCGGAGCCAG CGATCTTTTT 540

TACTGGCCAA GGAGAGCCGG GAAACTCCAG GGTTCCCACC ACAAGCCCCT GTCCAAGACG 600

GCCAATGGAC TGGCGGGGCA CGAAGGGTGG ACAATTGCAC TCCCGGGCCG GGCGCTGGCC 660

CAGAATGGAT CCTTGGGTGA AGGAATCCAT GAGCCTGGGG GTCCCCGCCG GGGAAACAGC 720

ACGAACCGGC GTGTGAGACT GAAGAACCCC TTCTACCCGC TGACCCAGGA GTCCTATGGA 780

GCCTACGCGG TCATGTGTCT GTCCGTGGTG ATCTTCGGGA CCGGCATCAT TGGCAACCTG 840

GCGGTGATGT GCATCGTGTG CCACAACTAC TACATGCGGA GCATCTCCAA CTCCCTCTTG 900

GCCAACCTGG CCTTCTGGGA CTTTCTCATC ATCTTCTTCT GCCTTCCGCT GGTCATCTTC 960

CACGAGCTGA CCAAGAAGTG GCTGCTGGAG GACTTCTCCT GCAAGATCGT GCCCTATATA 1020

GAGGTCGCCT CTCTGGGAGT CACCACTTTC ACCTTATGTG CTCTGTGCAT AGACCGCTTC 1080

CGTGCTGCCA CCAACGTACA GATGTACTAC GAAATGATCG AAAATTGTTC CTCAACAACT 1140

GCCAAACTTG CTGTTATATG GGTGGGAGCT CTATTGTTAG CACTTCCAGA AGTTGTTCTC 1200

CGCCAGCTGA GCAAGGAGGA TTTGGGGTTT AGTGGCCGAG CTCCGGCAGA AAGGTGCATT 1260

ATTAAGATCT CTCCTGATTT ACCAGACACC ATCTATGTTC TAGCCCTCAC CTACGACAGT 1320

GCGAGACTGT GGTGGTATTT TGGCTGTTAC TTTTGTTTGC CCACGCTTTT CACCATCACC 1380

TGCTCTCTAG TGACTGCGAG GAAAATCCGC AAAGCAGAGA AAGCCTGTAC CCGAGGGAAT 1440

AAACGGCAGA TTCAACTAGA GAGTCAGATG AAGTGTACAG TAGTGGCACT GACCATTTTA 1500

TATGGATTTT GCATTATTCC TGAAAATATC TGCAACATTG TTACTGCCTA CATGGCTACA 1560

GGGGTTTCAC AGCAGACAAT GGACCTCCTT AATATCATCA GCCAGTTCCT TTTGTTCTTT 1620

AAGTCCTGTG TCACCCCAGT CCTCCTTTTC TGTCTCTGCA AACCCTTCAG TCGGGCCTTC 1680

ATGGAGTGCT GCTGCTGTTG CTGTGAGGAA TGCATTCAGA AGTCTTCAAC GGTGACCAGT 1740

GATGACAATG ACAACGAGTA CACCACGGAA CTCGAACTCT CGCCTTTCAG TACCATACGC 1800

CGTGAAATGT CCACTTTTGC TTCTGTCGGA ACTCATTGCT GA 1842

(213) INFORMATION FOR SEQ ID NO:212:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:212: 
Met Arg Ala Pro Gly Ala Leu Leu Ala Arg Met Ser Arg Leu Leu Leu
1 5 10 15

Leu Leu Leu Leu Lys Val Ser Ala Ser Ser Ala Leu Gly Val Ala Pro
20 25 30

Ala Ser Arg Asn Glu Thr Cys Leu Gly Glu Ser Cys Ala Pro Thr Val
35 40 45

Ile Gln Arg Arg Gly Arg Asp Ala Trp Gly Pro Gly Asn Ser Ala Arg
50 55 60

Asp Val Leu Arg Ala Arg Ala Pro Arg Glu Glu Gln Gly Ala Ala Phe
65 70 75 80

Leu Ala Gly Pro Ser Trp Asp Leu Pro Ala Ala Pro Gly Arg Asp Pro
85 90 95

Ala Ala Gly Arg Gly Ala Glu Ala Ser Ala Ala Gly Pro Pro Gly Pro
100 105 110

Pro Thr Arg Pro Pro Gly Pro Trp Arg Trp Lys Gly Ala Arg Gly Gln
115 120 125

Glu Pro Ser Glu Thr Leu Gly Arg Gly Asn Pro Thr Ala Leu Gln Leu
130 135 140

Phe Leu Gln Ile Ser Glu Glu Glu Glu Lys Gly Pro Arg Gly Ala Gly
145 150 155 160

Ile Ser Gly Arg Ser Gln Glu Gln Ser Val Lys Thr Val Pro Gly Ala
165 170 175

Ser Asp Leu Phe Tyr Trp Pro Arg Arg Ala Gly Lys Leu Gln Gly Ser
180 185 190

His His Lys Pro Leu Ser Lys Thr Ala Asn Gly Leu Ala Gly His Glu
195 200 205

Gly Trp Thr Ile Ala Leu Pro Gly Arg Ala Leu Ala Gln Asn Gly Ser
210 215 220

Leu Gly Glu Gly Ile His Glu Pro Gly Gly Pro Arg Arg Gly Asn Ser
225 230 235 240

Thr Asn Arg Arg Val Arg Leu Lys Asn Pro Phe Tyr Pro Leu Thr Gln
245 250 255

Glu Ser Tyr Gly Ala Tyr Ala Val Met Cys Leu Ser Val Val Ile Phe
260 265 270

Gly Thr Gly Ile Ile Gly Asn Leu Ala Val Met Cys Ile Val Cys His
275 280 285

Asn Tyr Tyr Met Arg Ser Ile Ser Asn Ser Leu Leu Ala Asn Leu Ala
290 295 300

Phe Trp Asp Phe Leu Ile Ile Phe Phe Cys Leu Pro Leu Val Ile Phe
305 310 315 320

His Glu Leu Thr Lys Lys Trp Leu Leu Glu Asp Phe Ser Cys Lys Ile
325 330 335

Val Pro Tyr Ile Glu Val Ala Ser Leu Gly Val Thr Thr Phe Thr Leu
340 345 350

Cys Ala Leu Cys Ile Asp Arg Phe Arg Ala Ala Thr Asn Val Gln Met
355 360 365

Tyr Tyr Glu Met Ile Glu Asn Cys Ser Ser Thr Thr Ala Lys Leu Ala
370 375 380

Val Ile Trp Val Gly Ala Leu Leu Leu Ala Leu Pro Glu Val Val Leu
385 390 395 400

Arg Gln Leu Ser Lys Glu Asp Leu Gly Phe Ser Gly Arg Ala Pro Ala
405 410 415

Glu Arg Cys Ile Ile Lys Ile Ser Pro Asp Leu Pro Asp Thr Ile Tyr
420 425 430

Val Leu Ala Leu Thr Tyr Asp Ser Ala Arg Leu Trp Trp Tyr Phe Gly
435 440 445

Cys Tyr Phe Cys Leu Pro Thr Leu Phe Thr Ile Thr Cys Ser Leu Val
450 455 460

Thr Ala Arg Lys Ile Arg Lys Ala Glu Lys Ala Cys Thr Arg Gly Asn
465 470 475 480

Lys Arg Gln Ile Gln Leu Glu Ser Gln Met Lys Cys Thr Val Val Ala
485 490 495

Leu Thr Ile Leu Tyr Gly Phe Cys Ile Ile Pro Glu Asn Ile Cys Asn
500 505 510

Ile Val Thr Ala Tyr Met Ala Thr Gly Val Ser Gln Gln Thr Met Asp
515 520 525

Leu Leu Asn Ile Ile Ser Gln Phe Leu Leu Phe Phe Lys Ser Cys Val
530 535 540

Thr Pro Val Leu Leu Phe Cys Leu Cys Lys Pro Phe Ser Arg Ala Phe
545 550 555 560

Met Glu Cys Cys Cys Cys Cys Cys Glu Glu Cys Ile Gln Lys Ser Ser
565 570 575

Thr Val Thr Ser Asp Asp Asn Asp Asn Glu Tyr Thr Thr Glu Leu Glu
580 585 590

Leu Ser Pro Phe Ser Thr Ile Arg Arg Glu Met Ser Thr Phe Ala Ser
595 600 605

Val Gly Thr His Cys
610

(214) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1248 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

ATGGTTTTTG CTCACAGAAT GGATAACAGC AAGCCACATT TGATTATTCC TACACTTCTG 60 

GTGCCCCTCC AAAACCGCAG CTGCACTGAA ACAGCCACAC CTCTCCCAAG CCAATACCTG 120 

ATGGAATTAA GTGAGGAGCA CAGTTGGATG AGCAACCAAA CAGACCTTCA CTATGTGCTG 180 

AAACCCGGGG AAGTGGCCAC AGCCAGCATC TTCTTTGGGA TTCTCTGGTT GTTTTCTATC 240 

TTCGGCAATT CCCTGGTTTG TTTGGTCATC CATAGGAGTA GGAGGACTCA GTCTACCACC 300

AACTACTTTG TGGTCTCCAT GGCATGTGCT GACCTTCTCA TCAGCGTTGC CAGCACGCCT 360 

TTCGTCCTGC TCCAGTTCAC CACTGGAAGG TGGACGCTGG GTAGTGCAAC GTGCAAGGTT 420 

GTGCGATATT TTCAATATCT CACTCCAGGT GTCCAGATCT ACGTTCTCCT CTCCATCTCC 480 

ATAGACCGGT TCTACACCAT CGTCTATCCT CTGAGCTTCA AGGTGTCCAG AGAAAAAGCC 540 

AAGAAAATGA TTGCGGCATC GTGGATCTTT GATGCAGGCT TTGTGACCCC TGTGCTCTTT 600

TTCTATGGCT CCAACTGGGA CAGTCATTGT AACTATTTCC TCCCCTCCTC TTGGGAAGGC 660 

ACTGCCTACA CTGTCATCCA CTTCTTGGTG GGCTTTGTGA TTCCATCTGT CCTCATAATT 720

TTATTTTACC AAAAGGTCAT AAAATATATT TGGAGAATAG GCACAGATGG CCGAACGGTG 780

AGGAGGACAA TGAACATTGT CCCTCGGACA AAAGTGAAAA CTAAAAAGAT GTTCCTCATT 840

TTAAATCTGT TGTTTTTGCT CTCCTGGCTG CCTTTTCATG TAGCTCAGCT ATGGCACCCC 900 

CATGAACAAG ACTATAAGAA AAGTTCCCTT GTTTTCACAG CTATCACATG GATATCCTTT 960

AGTTCTTCAG CCTCTAAACC TACTCTGTAT TCAATTTATA ATGCCAATTT TCGGAGAGGG 1020

ATGAAAGAGA CTTTTTGCAT GTCCTCTATG AAATGTTACC GAAGCAATGC CTATACTATC 1080 

ACAACAAGTT CAAGGATGGC CAAAA/U^C TACGTTGGCA TTTCAGAAAT CCCTTCCATG 1140 

GCCAAAACTA TTACCAAAGA CTCGATCTAT GACTCATTTG ACAGAGAAGC CAAGGAAAAA 1200 

AAGCTTGCTT GGCCCATTAA CTCAAATCCA CCAAATACTT TTGTCTAA 1248 
(215) INFORMATION FOR SEQ ID NO:214:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 415 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Met Val Phe Ala His Arg Met Asp Asn Ser Lys Pro His Leu Ile Ile
1 5 10 15

Pro Thr Leu Leu Val Pro Leu Gln Asn Arg Ser Cys Thr Glu Thr Ala
20 25 30

Thr Pro Leu Pro Ser Gln Tyr Leu Met Glu Leu Ser Glu Glu His Ser
35 40 45

Trp Met Ser Asn Gln Thr Asp Leu His Tyr Val Leu Lys Pro Gly Glu
50 55 60

Val Ala Thr Ala Ser Ile Phe Phe Gly Ile Leu Trp Leu Phe Ser Ile
65 70 75 80

Phe Gly Asn Ser Leu Val Cys Leu Val Ile His Arg Ser Arg Arg Thr
85 90 95

Gln Ser Thr Thr Asn Tyr Phe Val Val Ser Met Ala Cys Ala Asp Leu
100 105 110

Leu Ile Ser Val Ala Ser Thr Pro Phe Val Leu Leu Gln Phe Thr Thr
115 120 125

Gly Arg Trp Thr Leu Gly Ser Ala Thr Cys Lys Val Val Arg Tyr Phe
130 135 140

Gln Tyr Leu Thr Pro Gly Val Gln Ile Tyr Val Leu Leu Ser Ile Ser
145 150 155 160

Ile Asp Arg Phe Tyr Thr Ile Val Tyr Pro Leu Ser Phe Lys Val Ser
165 170 175

Arg Glu Lys Ala Lys Lys Met Ile Ala Ala Ser Trp Ile Phe Asp Ala
180 185 190

Gly Phe Val Thr Pro Val Leu Phe Phe Tyr Gly Ser Asn Trp Asp Ser
195 200 205

His Cys Asn Tyr Phe Leu Pro Ser Ser Trp Glu Gly Thr Ala Tyr Thr
210 215 220

Val Ile His Phe Leu Val Gly Phe Val Ile Pro Ser Val Leu Ile Ile
225 230 235 240

Leu Phe Tyr Gln Lys Val Ile Lys Tyr Ile Trp Arg Ile Gly Thr Asp
245 250 255

Gly Arg Thr Val Arg Arg Thr Met Asn Ile Val Pro Arg Thr Lys Val
260 265 270

Lys Thr Lys Lys Met Phe Leu Ile Leu Asn Leu Leu Phe Leu Leu Ser
275 280 285

Trp Leu Pro Phe His Val Ala Gln Leu Trp His Pro His Glu Gln Asp
290 295 300

Tyr Lys Lys Ser Ser Leu Val Phe Thr Ala Ile Thr Trp Ile Ser Phe
305 310 315 320

Ser Ser Ser Ala Ser Lys Pro Thr Leu Tyr Ser Ile Tyr Asn Ala Asn
325 330 335

Phe Arg Arg Gly Met Lys Glu Thr Phe Cys Met Ser Ser Met Lys Cys
340 345 350

Tyr Arg Ser Asn Ala Tyr Thr Ile Thr Thr Ser Ser Arg Met Ala Lys
355 360 365

Lys Asn Tyr Val Gly Ile Ser Glu Ile Pro Ser Met Ala Lys Thr Ile
370 375 380

Thr Lys Asp Ser Ile Tyr Asp Ser Phe Asp Arg Glu Ala Lys Glu Lys
385 390 395 400

Lys Leu Ala Trp Pro Ile Asn Ser Asn Pro Pro Asn Thr Phe Val
405 410 415

(216) INFORMATION FOR SEQ ID NO: 215: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215: 

ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120 

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180 

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240 

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300 

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360 

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420 

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480 

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540 

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600 

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660 

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATAAACT AACCATGTTT 720 

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780 

GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840 

TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900 

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCTCT 960 

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020 

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080 

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140 

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200 

TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260 

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320 

CCTGCCTCTG TCCATTTCAA GGCTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380 

AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAATG CTGCCACCAG CCACCCTAAA 1500

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTC CTGACTATCC CAAGCCTGCC 1560

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GACAACCCTG AGCTCTCTCC CTCCCATTGC 1620

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCC TGAGTCGGCC 1680

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTCGAGTC TGACACCATC 1740

GCTGACCTTC CTGACCCTAC TGTAGTCACT ACCAGTACCA ATGATTACCA TGATGTCGTG 1800

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842
(217) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216:
Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Lys Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Ser
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Ala
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Asn Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr
500 505 510

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala
515 520 525

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro
530 535 540

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala
545 550 555 560

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu
565 570 575

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser
580 585 590

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro
595 600 605

Asp Glu Met Ala Val
610

(218) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1854 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:
ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATAAACT AACCATGTTT 720

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780

GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840

TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCTCT 960

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200

TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320

CGTGCCTCTG TCCATTTCAA GGCTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380

AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAGTG CTGCCACCAG CCACCCTAAA 1500

CCCACCACTG GCCACATCAA GCCAGCTACC AGCCATGCTG AGCCCACCAC TGCTGACTAT 1560

CCCAAGCCTG CCACTACCAG CCACCCTAAG CCCACTGCTG CTGACAACCC TGAGCTCTCT 1620

GCCTCCCATT GCCCCGAGAT CCCTGCCATT GCCCACCCTG TGTCTGACGA CAGTGACCTC 1680

CCTGAGTCGG CCTCTAGCCC TGCCGCTGGG CCCACCAAGC CTGCTGCCAG CCAGCTGGAG 1740

TCTGACACCA TCGCTGACCT TCCTGACCCT ACTGTAGTCA CTACCAGTAC CAATGATTAC 1800

CATGATGTCG TGGTTGTTGA TGTTGAAGAT GATCCTGATG AAATGGCTGT GTGA 1854


(219) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 617 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218: 

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Lys Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Ser
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Ala
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Ser Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Thr Thr Gly His Ile Lys Pro Ala Thr Ser His
500 505 510

Ala Glu Pro Thr Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His
515 520 525

Pro Lys Pro Thr Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys
530 535 540

Pro Glu Ile Pro Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu
545 550 555 560

Pro Glu Ser Ala Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala
565 570 575

Ser Gln Leu Glu Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val
580 585 590

Val Thr Thr Ser Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val
595 600 605

Glu Asp Asp Pro Asp Glu Met Ala Val
610 615

(220) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1548 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:
ATGGGACATA ACGGGAGCTG GATCTCTCCA AATGCCAGCG AGCCGCACAA CGCGTCCGGC 60

GCCGAGGCTG CGGGTGTGAA CCGCAGCGCG CTCGGGGAGT TCGGCGAGGC GCAGCTGTAC 120

CGCCAGTTCA CCACCACCGT GCAGGTCGTC ATCTTCATAG GCTCGCTGCT CGGAAACTTC 180

ATGGTGTTAT GGTCAACTTG CCGCACAACC GTGTTCAAAT CTGTCACCAA CAGGTTCATT 240

AAAAACCTGG CCTGCTCGGG GATTTGTGCC AGCCTGGTCT GTGTGCCCTT CGACATCATC 300

CTCAGCACCA GTCCTCACTG TTGCTGGTGG ATCTACACCA TGCTCTTCTG CAAGGTCGTC 360

AAATTTTTGC ACAAAGTATT CTGCTCTGTG ACCATCCTCA GCTTCCCTCC TATTGCTTTG 420

GACAGGTACT ACTCAGTCCT CTATCCACTG GAGAGGAAAA TATCTGATGC CAAGTCCCGT 480

GAACTGGTGA TGTACATCTG GGCCCATGCA GTGGTGGCCA GTGTCCCTGT GTTTGCAGTA 540

ACCAATGTGG CTGACATCTA TGCCACGTCC ACCTGCACGG AAGTCTCGAG CAACTCCTTG 600

GGCCACCTGG TGTACGTTCT GGTGTATAAC ATCACCACGG TCATTGTGCC TGTGGTCGTG 660

GTGTTCCTCT TCTTGATACT GATCCGACGG GCCCTGAGTG CCAGCCAGAA GAAGAAGGTC 720

ATCATAGCAG CGCTCCGGAC CCCACAGAAC ACCATCTCTA TTCCCTATGC CTCCCAGCGG 780

GAGGCCGAGC TGAAAGCCAC CCTGCTCTCC ATGGTGATCG TCTTCATCTT GTGTAGCGTG 840

CCCTATGCCA CCCTGGTCGT CTACCAGACT GTGCTCAATG TCCCTGACAC TTCCGTCTTC 900

TTGCTGCTCA CTGCTGTTTG GCTGCCCAAA GTCTCCCTGC TGGCAAACCC TGTTCTCTTT 960

CTTACTGTGA ACAAATCTGT CCGCAAGTGC TTGATAGGGA CCCTGGTGCA ACTACACCAC 1020

CGGTACAGTC GCCGTAATGT GGTCAGTACA GGGAGTGGCA TGGCTGAGGC CAGCCTGGAA 1080

CCCAGCATAC GCTCGGGTAG CCAGCTCCTG GAGATGTTCC ACATTGGGCA GCAGCAGATC 1140

TTTAAGCCCA CAGAGGATGA GGAAGAGAGT GAGGCCAAGT ACATTGGCTC AGCTGACTTC 1200

CAGGCCAAGG AGATATTTAG CACCTGCCTG GAGGGAGAGC AGGGGCCACA GTTTGCGCCC 1260

TCTGCCCCAC CCCTGAGCAC AGTGGACTCT GTATCCCAGG TGGCACCGGC AGCCCCTGTG 1320

GAACCTGAAA CATTCCCTGA TAAGTATTCC CTGCAGTTTG GCTTTGGGCC TTTTGAGTTG 1380

CCTCCTCAGT GGCTCTCAGA GACCCGAT^C AGCAAGAAGC GGCTGCTTCC CCCCTTGGGC 1440

AACACCCCAG AAGAGCTGAT CCAGACAAAG GTGCCCAAGG TAGGCAGGGT GGAGCGGAAG 1500

ATGAGCAGAA ACAATAT^GT GAGCATTTTT CCAAAGGTGG ATTCCTAG 1548
(221) INFORMATION FOR SEQ ID NO:220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Met Gly His Asn Gly Ser Trp Ile Ser Pro Asn Ala Ser Glu Pro His
1 5 10 15

Asn Ala Ser Gly Ala Glu Ala Ala Gly Val Asn Arg Ser Ala Leu Gly
20 25 30

Glu Phe Gly Glu Ala Gln Leu Tyr Arg Gln Phe Thr Thr Thr Val Gln
35 40 45

Val Val Ile Phe Ile Gly Ser Leu Leu Gly Asn Phe Met Val Leu Trp
50 55 60

Ser Thr Cys Arg Thr Thr Val Phe Lys Ser Val Thr Asn Arg Phe Ile
65 70 75 80

Lys Asn Leu Ala Cys Ser Gly Ile Cys Ala Ser Leu Val Cys Val Pro
85 90 95

Phe Asp Ile Ile Leu Ser Thr Ser Pro His Cys Cys Trp Trp Ile Tyr
100 105 110

Thr Met Leu Phe Cys Lys Val Val Lys Phe Leu His Lys Val Phe Cys
115 120 125

Ser Val Thr Ile Leu Ser Phe Pro Ala Ile Ala Leu Asp Arg Tyr Tyr
130 135 140

Ser Val Leu Tyr Pro Leu Glu Arg Lys Ile Ser Asp Ala Lys Ser Arg
145 150 155 160

Glu Leu Val Met Tyr Ile Trp Ala His Ala Val Val Ala Ser Val Pro
165 170 175

Val Phe Ala Val Thr Asn Val Ala Asp Ile Tyr Ala Thr Ser Thr Cys
180 185 190

Thr Glu Val Trp Ser Asn Ser Leu Gly His Leu Val Tyr Val Leu Val
195 200 205

Tyr Asn Ile Thr Thr Val Ile Val Pro Val Val Val Val Phe Leu Phe
210 215 220

Leu Ile Leu Ile Arg Arg Ala Leu Ser Ala Ser Gln Lys Lys Lys Val
225 230 235 240

Ile Ile Ala Ala Leu Arg Thr Pro Gln Asn Thr Ile Ser Ile Pro Tyr
245 250 255

Ala Ser Gln Arg Glu Ala Glu Leu Lys Ala Thr Leu Leu Ser Met Val
260 265 270

Met Val Phe Ile Leu Cys Ser Val Pro Tyr Ala Thr Leu Val Val Tyr
275 280 285

Gln Thr Val Leu Asn Val Pro Asp Thr Ser Val Phe Leu Leu Leu Thr
290 295 300

Ala Val Trp Leu Pro Lys Val Ser Leu Leu Ala Asn Pro Val Leu Phe
305 310 315 320

Leu Thr Val Asn Lys Ser Val Arg Lys Cys Leu Ile Gly Thr Leu Val
325 330 335

Gln Leu His His Arg Tyr Ser Arg Arg Asn Val Val Ser Thr Gly Ser
340 345 350

Gly Met Ala Glu Ala Ser Leu Glu Pro Ser Ile Arg Ser Gly Ser Gln
355 360 365

Leu Leu Glu Met Phe His Ile Gly Gln Gln Gln Ile Phe Lys Pro Thr
370 375 380

Glu Asp Glu Glu Glu Ser Glu Ala Lys Tyr Ile Gly Ser Ala Asp Phe
385 390 395 400

Gln Ala Lys Glu Ile Phe Ser Thr Cys Leu Glu Gly Glu Gln Gly Pro
405 410 415

Gln Phe Ala Pro Ser Ala Pro Pro Leu Ser Thr Val Asp Ser Val Ser
420 425 430

Gln Val Ala Pro Ala Ala Pro Val Glu Pro Glu Thr Phe Pro Asp Lys
435 440 445

Tyr Ser Leu Gln Phe Gly Phe Gly Pro Phe Glu Leu Pro Pro Gln Trp
450 455 460

Leu Ser Glu Thr Arg Asn Ser Lys Lys Arg Leu Leu Pro Pro Leu Gly
465 470 475 480

Asn Thr Pro Glu Glu Leu Ile Gln Thr Lys Val Pro Lys Val Gly Arg
485 490 495

Val Glu Arg Lys Met Ser Arg Asn Asn Lys Val Ser Ile Phe Pro Lys
500 505 510

Val Asp Ser
515
515 

(222) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

ATGAATCGGC ACCATCTGCA GGATCACTTT CTGGAT^TAG ACAAGAAGAA CTGCTGTGTG 60 

TTCCGAGATG ACTTCATTGC CAAGGTGTTG CCGCCGGTGT TGGGGCTGGA GTTTATCTTT 120 

GGGCTTCTGG GCAATGGCCT TGCCCTGTGG ATTTTCTGTT TCCACCTCAA GTCCTGGAAA 180 

TCCAGCCGGA TTTTCCTGTT CAACCTGGCA GTAGCTGACT TTCTACTGAT CATCTGCCTG 240 

CCGTTCGTGA TGGACTACTA TGTGCGGCGT TCAGACTGGA AGTTTGGGGA CATCCCTTGC 300 

CGGCTGGTGC TCTTCATGTT TGCCATGAAC CGCCAGGGCA GCATCATCTT CCTCACGGTG 360 

GTGGCGGTAG ACAGGTATTT CCGGGTGGTC CATCCCCACC ACGCCCTGAA CAAGATCTCC 420 

AATTGGACAG CAGCCATCAT CTCTTGCCTT CTGTGGGGCA TCACTGTTGG CCTAACAGTC 480 

CACCTCCTGA AGAAGAAGTT GCTGATCCAG AATGGCCCTG CAAATGTGTG CATCAGCTTC 540 

AGCATCTGCC ATACCTTCCG GTGGCACGAA GCTATGTTCC TCCTGGAGTT CCTCCTGCCC 600 
CTGGGCATCA TCCTGTTCTG CTCAGCCAGA ATTATCTGGA GCCTCCGGCA GAGACAAATC 660 
GACCGGCATG CCAAGATCAA GAGAGCCAAA ACCTTCATCA TGGTGGTGGC CATCGTCTTT 720 
GTCATCTGCT TCCTTCCCAG CGTGGTTGTG CGGATCCGCA TCTTCTGGCT CCTGCACACT 780 
TCGGGCACGC AGAATTGTGA AGTGTACCGC TCGGTGGACC TGGCGTTCTT TATCACTCTC 840 
AGCTTCACCT ACATGAACAG CATGCTGGAC CCCG-TOGTCT ACTACTTCTC CAGCCCATCC 900 
TTTCCCAACT TCTTCTCCAC TTTGATCAAC CGCTGCCTCC AGAGGAAGAT GACAGGTGAG 960 
CCAGATAATA ACCGCAGCAC GAGCGTCGAG CTCACAGGGG ACCCCAACAA AACCAGAGGC 1020 
GCTCCAGAGG CGTTAATGGC CAACTCCGGT GAGCCATGGA GCCCCTCTTA TCTCGGCCCA 1080 
ACCTCAAATA ACCATTCCAA GAAGGGACAT TGTCACCAAG AACCAGCATC TCTCGAGAAA 1140 
CAGTTGGGCT GTTGCATCGA GTAA 1164 

(223) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Met Asn Arg His His Leu Gln Asp His Phe Leu Glu Ile Asp Lys Lys 
1 5 10 15 

Asn Cys Cys Val Phe Arg Asp Asp Phe Ile Ala Lys Val Leu Pro Pro 
20 25 30 

Val Leu Gly Leu Glu Phe Ile Phe Gly Leu Leu Gly Asn Gly Leu Ala 
35 40 45 

Leu Trp Ile Phe Cys Phe His Leu Lys Ser Trp Lys Ser Ser Arg Ile 
50 55 60 

Phe Leu Phe Asn Leu Ala Val Ala Asp Phe Leu Leu Ile Ile Cys Leu 
65 70 75 80 

Pro Phe Val Met Asp Tyr Tyr Val Arg Arg Ser Asp Trp Lys Phe Gly 
85 90 95 

Asp Ile Pro Cys Arg Leu Val Leu Phe Met Phe Ala Met Asn Arg Gln 
100 105 110 

Gly Ser Ile Ile Phe Leu Thr Val Val Ala Val Asp Arg Tyr Phe Arg 
115 120 125 

Val Val His Pro His His Ala Leu Asn Lys Ile Ser Asn Trp Thr Ala 
130 135 140 

Ala Ile Ile Ser Cys Leu Leu Trp Gly Ile Thr Val Gly Leu Thr Val 
145 150 155 160 

His Leu Leu Lys Lys Lys Leu Leu Ile Gln Asn Gly Pro Ala Asn Val 
165 170 175 

Cys Ile Ser Phe Ser Ile Cys His Thr Phe Arg Trp His Glu Ala Met 
180 185 190 

Phe Leu Leu Glu Phe Leu Leu Pro Leu Gly Ile Ile Leu Phe Cys Ser 
195 200 205 

Ala Arg Ile Ile Trp Ser Leu Arg Gln Arg Gln Met Asp Arg His Ala 
210 215 220 

Lys Ile Lys Arg Ala Lys Thr Phe Ile Met Val Val Ala Ile Val Phe 
225 230 235 240 

Val Ile Cys Phe Leu Pro Ser Val Val Val Arg Ile Arg Ile Phe Trp 
245 250 255 

Leu Leu His Thr Ser Gly Thr Gln Asn Cys Glu Val Tyr Arg Ser Val 
260 265 270 

Asp Leu Ala Phe Phe Ile Thr Leu Ser Phe Thr Tyr Met Asn Ser Met 
275 280 285 

Leu Asp Pro Val Val Tyr Tyr Phe Ser Ser Pro Ser Phe Pro Asn Phe 
290 295 300 

Phe Ser Thr Leu Ile Asn Arg Cys Leu Gln Arg Lys Met Thr Gly Glu 
305 310 315 320 

Pro Asp Asn Asn Arg Ser Thr Ser Val Glu Leu Thr Gly Asp Pro Asn 
325 330 335 

Lys Thr Arg Gly Ala Pro Glu Ala Leu Met Ala Asn Ser Gly Glu Pro 
340 345 350 

Trp Ser Pro Ser Tyr Leu Gly Pro Thr Ser Asn Asn His Ser Lys Lys 
355 360 365 

Gly His Cys His Gln Glu Pro Ala Ser Leu Glu Lys Gln Leu Gly Cys 
370 375 380 

Cys Ile Glu 
385 

(224) INFORMATION FOR SEQ ID NO:223: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1212 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223: 

ATGGCTTGCA ATGGCAGTGC GGCCAGGGGG CACTTTGACC CTGAGGACTT GAACCTGACT 60 

GACGAGGCAC TGAGACTCAA GTACCTGGGG CCCCAGCAGA CAGAGCTGTT CATGCCCATC 120 

TGTGCCACAT ACCTGCTGAT CTTCGTGGTG GGCGCTGTGG GCAATGGGCT GACCTGTCTG 180 

GTCATCCTGC GCCACAAGGC CATGCGCACG CCTACCAACT ACTACCTCTT CAGCCTGGCC 240 

GTGTCGGACC TGCTGGTGCT GCTGGTGGGC CTGCCCCTCG AGCTCTATGA GATGTGGCAC 300 

AACTACCCCT TCCTGCTGGG CGTTGGTGGC TGCTATTTCC GCACGCTACT GTTTGAGATG 360 

GTCTGCCTGG CCTCAGTGCT CAACGTCACT GCCCTGAGCG TGGAACGCTA TGTGGCCGTG 420 

GTGCACCCAC TCCAGGCCAG GTCCATGGTG ACGCGGGCCC ATGTGCGCCG AGTGCTTGGG 480 

GCCGTCTGGG GTCTTGCCAT GCTCTGCTCC CTGCCCAACA CCAGCCTGCA CGGCATCCGG 540 

CAGCTGCACG TGCCCTGCCG GGGCCCAGTG CCAGACTCAG CTGTTTGCAT GCTGGTCCGC 600 

CCACGGGCCC TCTACAACAT GGTAGTGCAG ACCACCGCGC TGCTCTTCTT CTGCCTGCCC 660 

ATGGCCATCA TGAGCGTGCT CTACCTGCTC ATTGGGCTGC GACTCCGGCG GGAGAGGCTG 720 

CTGCTCATGC AGGAGGCCAA GGGCAGGGGC TCTCCAGCAG CCAGGTCCAG ATACACCTGC 780 

AGGCTCCAGC AGCACGATCG GGGCCGGAGA CAAGTGAAGA AGATGCTGTT TGTCCTGGTC 840 

GTGGTGTTTG GCATCTGCTG GGCCCCGTTC CACGCCGACC GCGTCATGTG GAGCGTCGTG 900 

TCACAGTGGA CAGATGGCCT GCACCTGGCC TTCCAGCACG TGCACGTCAT CTCCGGCATC 960 

TTCTTCTACC TGGGCTCGGC GGCCAACCCC GTCCTCTATA GCCTCATGTC CAGCCGCTTC 1020 

CGAGAGACCT TCCAGGAGGC CCTGTGCCTC GGGGCCTGCT GCCATCGCCT CAGACCCCGC 1080 

CACAGCTCCC ACAGCCTCAG CAGGATGACC ACAGGCAGCA CCCTGTGTGA TGTGGGCTCC 1140 

CTGGGCAGCT GGGTCCACCC CCTGGCTGGG AACGATGGCC CAGAGGCGCA GCAAGAGACC 1200 
GATCCATCCT GA 1212 

(225) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 403 amino acids 

(B) TYPE: amino acid 




(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Met Ala Cys Asn Gly Ser Ala Ala Arg Gly His Phe Asp Pro Glu Asp 
1 5 10 15 

Leu Asn Leu Thr Asp Glu Ala Leu Arg Leu Lys Tyr Leu Gly Pro Gln 
20 25 30 

Gln Thr Glu Leu Phe Met Pro Ile Cys Ala Thr Tyr Leu Leu Ile Phe 
35 40 45 

Val Val Gly Ala Val Gly Asn Gly Leu Thr Cys Leu Val Ile Leu Arg 
50 55 60 

His Lys Ala Met Arg Thr Pro Thr Asn Tyr Tyr Leu Phe Ser Leu Ala 
65 70 75 80 

Val Ser Asp Leu Leu Val Leu Leu Val Gly Leu Pro Leu Glu Leu Tyr 
85 90 95 

Glu Met Trp His Asn Tyr Pro Phe Leu Leu Gly Val Gly Gly Cys Tyr 
100 105 110 

Phe Arg Thr Leu Leu Phe Glu Met Val Cys Leu Ala Ser Val Leu Asn 
115 120 125 

Val Thr Ala Leu Ser Val Glu Arg Tyr Val Ala Val Val His Pro Leu 
130 135 140 

Gln Ala Arg Ser Met Val Thr Arg Ala His Val Arg Arg Val Leu Gly 
145 150 155 160 

Ala Val Trp Gly Leu Ala Met Leu Cys Ser Leu Pro Asn Thr Ser Leu 
165 170 175 

His Gly Ile Arg Gln Leu His Val Pro Cys Arg Gly Pro Val Pro Asp 
180 185 190 

Ser Ala Val Cys Met Leu Val Arg Pro Arg Ala Leu Tyr Asn Met Val 
195 200 205 

Val Gln Thr Thr Ala Leu Leu Phe Phe Cys Leu Pro Met Ala Ile Met 
210 215 220 

Ser Val Leu Tyr Leu Leu Ile Gly Leu Arg Leu Arg Arg Glu Arg Leu 
225 230 235 240 

Leu Leu Met Gln Glu Ala Lys Gly Arg Gly Ser Ala Ala Ala Arg Ser 
245 250 255 

Arg Tyr Thr Cys Arg Leu Gln Gln His Asp Arg Gly Arg Arg Gln Val 
260 265 270 

Lys Lys Met Leu Phe Val Leu Val Val Val Phe Gly Ile Cys Trp Ala 
275 280 285 

Pro Phe His Ala Asp Arg Val Met Trp Ser Val Val Ser Gln Trp Thr 
290 295 300 

Asp Gly Leu His Leu Ala Phe Gln His Val His Val Ile Ser Gly Ile 
305 310 315 320 

Phe Phe Tyr Leu Gly Ser Ala Ala Asn Pro Val Leu Tyr Ser Leu Met 
325 330 335 

Ser Ser Arg Phe Arg Glu Thr Phe Gln Glu Ala Leu Cys Leu Gly Ala 
340 345 350 

Cys Cys His Arg Leu Arg Pro Arg His Ser Ser His Ser Leu Ser Arg 
355 360 365 

Met Thr Thr Gly Ser Thr Leu Cys Asp Val Gly Ser Leu Gly Ser Trp 
370 375 380 

Val His Pro Leu Ala Gly Asn Asp Gly Pro Glu Ala Gln Gln Glu Thr 
385 390 395 400 

Asp Pro Ser 



(226) INFORMATION FOR SEQ ID NO:225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1098 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225: 



ATGGGGAACA TCACTGCAGA CAACTCCTCG ATGAGCTGTA CCATCGACCA TACCATCCAC 60 

CAGACGCTGG CCCCGGTGGT CTATGTTACC GTGCTGGTGG TGGGCTTCCC GGCCAACTGC 120 

CTGTCCCTCT ACTTCGGCTA CCTGCAGATC AAGGCCCGGA ACGAGCTGGG CGTGTACCTG 180 

TGCAACCTGA CGGTGGCCGA CCTCTTCTAC ATCTGCTCGC TGCCCTTCTG GCTGCAGTAC 240 

GTGCTGCAGC ACGACAACTG GTCTCACGGC GACCTGTCCT GCCAGGTGTG CGGCATCCTC 300 

CTGTACGAGA ACATCTACAT CAGCGTGGGC TTCCTCTGCT GCATCTCCGT GGACCGCTAC 360 

CTGGCTGTGG CCCATCCCTT CCGCTTCCAC CAGTTCCGGA CCCTGAAGGC GGCCGTCGGC 420 

GTCAGCGTGG TCATCTGGGC CAAGGAGCTG CTGACCAGCA TCTACTTCCT GATGCACGAG 480 

GAGGTCATCG AGGACGAGAA CCAGCACCGC GTGTGCTTTG AGCACTACCC CATCCAGGCA 540 

TGGCAGCGCG CCATCAACTA CTACCGCTTC CTGGTGGGCT TCCTCTTCCC CATCTGCCTG 600 

CTGCTGGCGT CCTACCAGGG CATCCTGCGC GCCGTGCGCC GGAGCCACGG CACCCAGAAG 660 

AGCCGCAAGG ACCAGATCAA GCGGCTGGTG CTCAGCACCG TGGTCATCTT CCTGGCCTGC 720 

TTCCTGCCCT ACCACGTGTT GCTGCTGGTG CGCAGCGTCT GGGAGGCCAG CTGCGACTTC 780 

GCCAAGGGCG TTTTCAACGC CTACCACTTC TCCCTCCTGC TCACCAGCTT CAACTGCGTC 840 

GCCGACCCCG TGCTCTACTG CTTCGTCAGC GAGACCACCC ACCGGGACCT GGCCCGCCTC 900 

CGCGGGGCCT GCCTGGCCTT CCTCACCTGC TCCAGGACCG GCCGGGCCAG GGAGGCCTAC 960 

CCGCTGGGTG CCCCCGAGGC CTCCGGGAAA AGCGGGGCCC AGGGTGAGGA GCCCGAGCTG 1020 

TTGACCAAGC TCCACCCGGC CTTCCAGACC CCTAACTCGC CAGGGTCGGG CGGGTTCCCC 1080 

ACGGGCAGGT TGGCCTAG 1098 
(227) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 365 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226: 

Met Gly Asn Ile Thr Ala Asp Asn Ser Ser Met Ser Cys Thr Ile Asp 
1 5 10 15 

His Thr Ile His Gln Thr Leu Ala Pro Val Val Tyr Val Thr Val Leu 
20 25 30 

Val Val Gly Phe Pro Ala Asn Cys Leu Ser Leu Tyr Phe Gly Tyr Leu 
35 40 45 

Gln Ile Lys Ala Arg Asn Glu Leu Gly Val Tyr Leu Cys Asn Leu Thr 
50 55 60 

Val Ala Asp Leu Phe Tyr Ile Cys Ser Leu Pro Phe Trp Leu Gln Tyr 
65 70 75 80 

Val Leu Gln His Asp Asn Trp Ser His Gly Asp Leu Ser Cys Gln Val 
85 90 95 

Cys Gly Ile Leu Leu Tyr Glu Asn Ile Tyr Ile Ser Val Gly Phe Leu 
100 105 110 

Cys Cys Ile Ser Val Asp Arg Tyr Leu Ala Val Ala His Pro Phe Arg 
115 120 125 

Phe His Gln Phe Arg Thr Leu Lys Ala Ala Val Gly Val Ser Val Val 
130 135 140 

Ile Trp Ala Lys Glu Leu Leu Thr Ser Ile Tyr Phe Leu Met His Glu 
145 150 155 160 

Glu Val Ile Glu Asp Glu Asn Gln His Arg Val Cys Phe Glu His Tyr 
165 170 175 

Pro Ile Gln Ala Trp Gln Arg Ala Ile Asn Tyr Tyr Arg Phe Leu Val 
180 185 190 

Gly Phe Leu Phe Pro Ile Cys Leu Leu Leu Ala Ser Tyr Gln Gly Ile 
195 200 205 

Leu Arg Ala Val Arg Arg Ser His Gly Thr Gln Lys Ser Arg Lys Asp 
210 215 220 

Gln Ile Lys Arg Leu Val Leu Ser Thr Val Val Ile Phe Leu Ala Cys 
225 230 235 240 

Phe Leu Pro Tyr His Val Leu Leu Leu Val Arg Ser Val Trp Glu Ala 
245 250 255 

Ser Cys Asp Phe Ala Lys Gly Val Phe Asn Ala Tyr His Phe Ser Leu 
260 265 270 

Leu Leu Thr Ser Phe Asn Cys Val Ala Asp Pro Val Leu Tyr Cys Phe 
275 280 285 

Val Ser Glu Thr Thr His Arg Asp Leu Ala Arg Leu Arg Gly Ala Cys 
290 295 300 

Leu Ala Phe Leu Thr Cys Ser Arg Thr Gly Arg Ala Arg Glu Ala Tyr 
305 310 315 320 

Pro Leu Gly Ala Pro Glu Ala Ser Gly Lys Ser Gly Ala Gln Gly Glu 
325 330 335 

Glu Pro Glu Leu Leu Thr Lys Leu His Pro Ala Phe Gln Thr Pro Asn 
340 345 350 

Ser Pro Gly Ser Gly Gly Phe Pro Thr Gly Arg Leu Ala 
355 360 365 

(228) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1416 base pairs 

(B) TYPE: nucleic acid 




(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 

ATGGATATTC TTTGTGAAGA AAATACTTCT TTGAGCTCAA CTACGAACTC CCTAATGCAA 60 

TTAAATGATG ACAACAGGCT CTACAGTAAT GACTTTAACT CCGGAGAAGC TAACACTTCT 120 

GATGCATTTA ACTGGACAGT CGACTCTGAA AATCGAACCA ACCTTTCCTG TGAAGGGTGC 180 

CTCTCACCGT CGTGTCTCTC CTTACTTCAT CTCCAGGAAA AAAACTGGTC TGCTTTACTG 240 

ACAGCCGTAG TGATTATTCT AACTATTGCT GGAAACATAC TCGTCATCAT GGCAGTGTCC 300 

CTAGAGAAAA AGCTGCAGAA TGCCACCAAC TATTTCCTGA TGTCACTTGC CATAGCTGAT 360 

ATGCTGCTGG GTTTCCTTGT CATGCCCGTG TCCATGTTAA CCATCCTGTA TGGGTACCGG 420 

TGGCCTCTGC CGAGCAAGCT TTGTGCAGTC TGGATTTACC TGGACGTGCT CTTCTCCACG 480 

GCCTCCATCA TGCACCTCTG CGCCATCTCG CTGGACCGCT ACGTCGCCAT CCAGAATCCC 540 

ATCCACCACA GCCGCTTCAA CTCCAGAACT AAGGCATTTC TGAAAATCAT TGCTGTTTGG 600 

ACCATATCAG TAGGTATATC CATGCCAATA CCAGTCTTTG GGCTACAGGA CGATTCGAAG 660 

GTCTTTAAGG AGGGGAGTTG CTTACTCGCC GATGATAACT TTGTCCTGAT CGGCTCTTTT 720 

GTGTCATTTT TCATTCCCTT AACCATCATG GTGATCACCT ACTTTCTAAC TATCAAGTCA 780 

CTCCAGAAAG AAGCTACTTT GTGTGTAAGT GATCTTGGCA CACGGGCCAA ATTAGCTTCT 840 

TTCAGCTTCC TCCCTCAGAG TTCTTTGTCT TCAGAAAAGC TCTTCCAGCG GTCGATCCAT 900 

AGGGAGCCAG GGTCCTACAC AGGCAGGAGG ACTATGCAGT CCATCAGCAA TGAGCAAAAG 960 

GCAAAGAAGG TGCTGGGCAT CGTCTTCTTC CTGTTTGTGG TGATGTGGTG CCCTTTCTTC 1020 

ATCACAAACA TCATGGCCGT CATCTGCAAA GAGTCCTGCA ATGAGGATGT CATTGGGGCC 1080 

CTGCTCAATG TGTTTGTTTG GATCGGTTAT CTCTCTTCAG CAGTCAACCC ACTAGTCTAC 1140 

ACACTGTTCA ACAAGACCTA TAGGTCAGCC TTTTCACGGT ATATTCAGTG TCAGTACAAG 1200 

GAAAACAAAA AACCATTGCA GTTAATTTTA GTGAACACAA TACCGGCTTT GGCCTACAAG 1260 

TCTAGCCAAC TTCAAATGGG ACAAAAAAAG AATTCAAAGC AAGATGCCAA GACAACAGAT 1320 

AATGACTGCT CAATGGTTGC TCTAGGAAAG CAGTATTCTG AAGAGGCTTC TAAAGACAAT 1380 

AGCGACGGAG TGAATGAAAA GGTGAGCTGT GTGTGA 1416 
(229) INFORMATION FOR SEQ ID NO: 228: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 

Met Asp Ile Leu Cys Glu Glu Asn Thr Ser Leu Ser Ser Thr Thr Asn 
1 5 10 15 

Ser Leu Met Gln Leu Asn Asp Asp Asn Arg Leu Tyr Ser Asn Asp Phe 
20 25 30 

Asn Ser Gly Glu Ala Asn Thr Ser Asp Ala Phe Asn Trp Thr Val Asp 
35 40 45 

Ser Glu Asn Arg Thr Asn Leu Ser Cys Glu Gly Cys Leu Ser Pro Ser 
50 55 60 

Cys Leu Ser Leu Leu His Leu Gln Glu Lys Asn Trp Ser Ala Leu Leu 
65 70 75 80 

Thr Ala Val Val Ile Ile Leu Thr Ile Ala Gly Asn Ile Leu Val Ile 
85 90 95 

Met Ala Val Ser Leu Glu Lys Lys Leu Gln Asn Ala Thr Asn Tyr Phe 
100 105 110 

Leu Met Ser Leu Ala Ile Ala Asp Met Leu Leu Gly Phe Leu Val Met 
115 120 125 

Pro Val Ser Met Leu Thr Ile Leu Tyr Gly Tyr Arg Trp Pro Leu Pro 
130 135 140 

Ser Lys Leu Cys Ala Val Trp Ile Tyr Leu Asp Val Leu Phe Ser Thr 
145 150 155 160 

Ala Ser Ile Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala 
165 170 175 

Ile Gln Asn Pro Ile His His Ser Arg Phe Asn Ser Arg Thr Lys Ala 
180 185 190 

Phe Leu Lys Ile Ile Ala Val Trp Thr Ile Ser Val Gly Ile Ser Met 
195 200 205 

Pro Ile Pro Val Phe Gly Leu Gln Asp Asp Ser Lys Val Phe Lys Glu 
210 215 220 

Gly Ser Cys Leu Leu Ala Asp Asp Asn Phe Val Leu Ile Gly Ser Phe 
225 230 235 240 

Val Ser Phe Phe Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Phe Leu 
245 250 255 

Thr Ile Lys Ser Leu Gln Lys Glu Ala Thr Leu Cys Val Ser Asp Leu 
260 265 270 

Gly Thr Arg Ala Lys Leu Ala Ser Phe Ser Phe Leu Pro Gln Ser Ser 
275 280 285 

Leu Ser Ser Glu Lys Leu Phe Gln Arg Ser Ile His Arg Glu Pro Gly 
290 295 300 

Ser Tyr Thr Gly Arg Arg Thr Met Gln Ser Ile Ser Asn Glu Gln Lys 
305 310 315 320 

Ala Lys Lys Val Leu Gly Ile Val Phe Phe Leu Phe Val Val Met Trp 
325 330 335 

Cys Pro Phe Phe Ile Thr Asn Ile Met Ala Val Ile Cys Lys Glu Ser 
340 345 350 

Cys Asn Glu Asp Val Ile Gly Ala Leu Leu Asn Val Phe Val Trp Ile 
355 360 365 

Gly Tyr Leu Ser Ser Ala Val Asn Pro Leu Val Tyr Thr Leu Phe Asn 
370 375 380 

Lys Thr Tyr Arg Ser Ala Phe Ser Arg Tyr Ile Gln Cys Gln Tyr Lys 
385 390 395 400 

Glu Asn Lys Lys Pro Leu Gln Leu Ile Leu Val Asn Thr Ile Pro Ala 
405 410 415 

Leu Ala Tyr Lys Ser Ser Gln Leu Gln Met Gly Gln Lys Lys Asn Ser 
420 425 430 

Lys Gln Asp Ala Lys Thr Thr Asp Asn Asp Cys Ser Met Val Ala Leu 
435 440 445 

Gly Lys Gln Tyr Ser Glu Glu Ala Ser Lys Asp Asn Ser Asp Gly Val 
450 455 460 

Asn Glu Lys Val Ser Cys Val 
465 470 

(230) INFORMATION FOR SEQ ID NO:229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1377 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:229: 
ATGGTGAACC TGAGGAATGC GGTGCATTCA TTCCTTGTCC ACCTAATTGG CCTATTGGTT 60 
TGGCAATGTG ATATTTCTGT GAGCCCAGTA GCAGCTATAG TAACTCACAT TTTCAATACC 120 
TCCGATGGTG GACGCTTCAA ATTCCCAGAC GGGGTACAAA ACTGGCCAGC ACTTTCAATC 180 
GTCATCATAA TAATCATGAC AATAGGTCGC AACATCCTTG TGATCATGGC AGTAAGCATG 240 
GAAAAGAAAC TGCACAATGC CACCAATTAC TTCTTAATGT CCCTAGCCAT TGCTGATATG 300 
CTAGTGGGAC TACTTCTCAT GCCCCTGTCT CTCCTGGCAA TCCTTTATGA TTATGTCTGG 360 
CCACTACCTA GATATTTGTG CCCCGTCTGG ATTTCTTTAG ATGTTTTATT TTCAACAGCG 420 
TCCATCATGC ACCTCTGCGC TATATCGCTG GATCGGTATG TAGCAATACG TAATCCTATT 480 
GAGCATAGCC GTTTCAATTC GCGGACTAAG GCCATCATGA AGATTGCTAT TGTTTGGGCA 540 
ATTTCTATAG GTGTATCAGT TCCTATCCCT GTCATTCGAC TCAGGGACGA AGAAAAGGTG 600 
TTCGTGAACA ACACGACGTG CGTGCTCAAC GACCCAAATT TCGTTCTTAT TGGGTCCTTC 660 
GTAGCTTTCT TCATACCGCT GACGATTATG GTGATTACGT ATTGCCTGAC CATCTACGTT 720 
CTGCGCCGAC AAGCTTTGAT GTTACTGCAC GGCCACACCG AGGAACCGCC TGGACTAAGT 780 
CTGGATTTCC TGAAGTGCTG CAAGAGGAAT ACGGCCGAGG AAGAGAACTC TGCAAACCCT 840 
AACCAAGACC AGAACGCACG CCGAAGAAAG AAGAAGGAGA GACGTCCTAG GGGCACCATG 900 
CAGGCTATCA ACAATGAAAG AAAAGCTAAG AAAGTCCTTC GGATTGTTTT CTTTGTGTTT 960 
CTGATCATGT GGTGCCCATT TTTCATTACC AATATTCTCT CTGTTCTTTG TGAGAAGTCC 1020 
TGTAACCAAA AGCTCATGGA AAAGCTTCTG AATGTGTTTG TTTGGATTGG CTATGTTTGT 1080 
TCAGGAATCA ATCCTCTGGT GTATACTCTG TTCAACAAAA TTTACCGAAG GGCATTCTCC 1140 
AACTATTTGC GTTGCAATTA TAAGGTAGAG AAAAAGCCTC CTGTCAGGCA GATTCCAAGA 1200 
GTTGCCGCCA CTGCTTTGTC TGGGAGGGAG CTTAATGTTA ACATTTATCG GCATACCAAT 1260 
GAACCGGTGA TCGAGAAAGC CAGTGACAAT GAGCCCGGTA TAGAGATGCA AGTTGAGAAT 1320 
TTAGAGTTAC CAGTAAATCC CTCCAGTGTG GTTAGCGAAA GGATTAGCAG TGTGTGA 1377 
(231) INFORMATION FOR SEQ ID NO:230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:230: 

Met Val Asn Leu Arg Asn Ala Val His Ser Phe Leu Val His Leu Ile 
1 5 10 15 

Gly Leu Leu Val Trp Gln Cys Asp Ile Ser Val Ser Pro Val Ala Ala 
20 25 30 

Ile Val Thr Asp Ile Phe Asn Thr Ser Asp Gly Gly Arg Phe Lys Phe 
35 40 45 

Pro Asp Gly Val Gln Asn Trp Pro Ala Leu Ser Ile Val Ile Ile Ile 
50 55 60 

Ile Met Thr Ile Gly Gly Asn Ile Leu Val Ile Met Ala Val Ser Met 
65 70 75 80 

Glu Lys Lys Leu His Asn Ala Thr Asn Tyr Phe Leu Met Ser Leu Ala 
85 90 95 

Ile Ala Asp Met Leu Val Gly Leu Leu Val Met Pro Leu Ser Leu Leu 
100 105 110 

Ala Ile Leu Tyr Asp Tyr Val Trp Pro Leu Pro Arg Tyr Leu Cys Pro 
115 120 125 

Val Trp Ile Ser Leu Asp Val Leu Phe Ser Thr Ala Ser Ile Met His 
130 135 140 

Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Arg Asn Pro Ile 
145 150 155 160 

Glu His Ser Arg Phe Asn Ser Arg Thr Lys Ala Ile Met Lys Ile Ala 
165 170 175 

Ile Val Trp Ala Ile Ser Ile Gly Val Ser Val Pro Ile Pro Val Ile 
180 185 190 

Gly Leu Arg Asp Glu Glu Lys Val Phe Val Asn Asn Thr Thr Cys Val 
195 200 205 

Leu Asn Asp Pro Asn Phe Val Leu Ile Gly Ser Phe Val Ala Phe Phe 
210 215 220 

Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Cys Leu Thr Ile Tyr Val 
225 230 235 240 

Leu Arg Arg Gln Ala Leu Met Leu Leu His Gly His Thr Glu Glu Pro 
245 250 255 

Pro Gly Leu Ser Leu Asp Phe Leu Lys Cys Cys Lys Arg Asn Thr Ala 
260 265 270 

Glu Glu Glu Asn Ser Ala Asn Pro Asn Gln Asp Gln Asn Ala Arg Arg 
275 280 285 

Arg Lys Lys Lys Glu Arg Arg Pro Arg Gly Thr Met Gln Ala Ile Asn 
290 295 300 

Asn Glu Arg Lys Ala Lys Lys Val Leu Gly Ile Val Phe Phe Val Phe 
305 310 315 320 

Leu Ile Met Trp Cys Pro Phe Phe Ile Thr Asn Ile Leu Ser Val Leu 
325 330 335 

Cys Glu Lys Ser Cys Asn Gln Lys Leu Met Glu Lys Leu Leu Asn Val 
340 345 350 

Phe Val Trp Ile Gly Tyr Val Cys Ser Gly Ile Asn Pro Leu Val Tyr 
355 360 365 

Thr Leu Phe Asn Lys Ile Tyr Arg Arg Ala Phe Ser Asn Tyr Leu Arg 
370 375 380 

Cys Asn Tyr Lys Val Glu Lys Lys Pro Pro Val Arg Gln Ile Pro Arg 
385 390 395 400 

Val Ala Ala Thr Ala Leu Ser Gly Arg Glu Leu Asn Val Asn Ile Tyr 
405 410 415 

Arg His Thr Asn Glu Pro Val Ile Glu Lys Ala Ser Asp Asn Glu Pro 
420 425 430 

Gly Ile Glu Met Gln Val Glu Asn Leu Glu Leu Pro Val Asn Pro Ser 
435 440 445 

Ser Val Val Ser Glu Arg Ile Ser Ser Val 
450 455 

(232) INFORMATION FOR SEQ ID NO: 231: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 

ATGGATCAGT TCCCT^TC AGTGACAGAA AACTTTCAGT ACGATGATTT GGCTGAGGCC 60 

TGTTATATTG GGGACATCGT GGTCTTTGGG ACTGTGTTCC TGTCCATATT CTACTCCGTC 120 

ATCTTTGCCA TTCGCCTGGT GGGAAATTO3 TTOGTAGTGT TTGCCCTCAC CAACAGCAAG 180 
AAGCCCAAGA GTGTCACCGA CATTTACCTC CTGAACCTGG CCTTGTCTGA TCTGCTGTTT 240 

GTAGCCACTT TGCCCTTCTG GACTCACTAT TTGATAAATG AAAAGGGCCT CCACAATGCC 300 

ATGTGCAAAT TCACTACCGC CTTCTTCTTC ATCGGCTTTT TTGGAAGCAT ATTCTTCATC 360 

ACCGTCATCA GCATTGATAG GTACCTGGCC ATCGTCCTGG CCGCCAACTC CATGAACAAC 420 

CGGACCGTGC AGCATGGCGT CACCATCAGC CTAGGCGTCT GGGCAGCAGC CATTTTGGTG 480 

GCAGCACCCC AGTTCATGTT CACAAAGCAG AAAGAAAATG AATGCCTTGG TGACTACCCC 540 

GAGGTCCTCC AGGAAATCTG GCCCGTGCTC CGCAATGTGG AAACAAATTT TCTTGGCTTC 600 

CTACTCCCCC TGCTCATTAT GAGTTATTGC TACTTCAGAA TCATCCAGAC GCTGTTTTCC 660 

TGCAAGAACC ACAAGAAAGC CAAAGCCAAG AAACTGATCC TTCTGGTGGT CATCGTGTTT 720 

TTCCTCTTCT GGACACCCTA CAACGTTATG ATTTTCCTGG AGACGCTTAA GCTCTATGAC 780 

TTCTTTCCCA GTTGTGACAT GAGGAAGGAT CTGAGGCTGG CCCTCAGTGT GACTGAGACG 840 

GTTGCATTTA GCCATTGTTG CCTGAATCCT CTCATCTATG CATTTGCTGG GGAGAAGTTC 900 

AGAAGATACC TTTACCACCT GTATGGGAAA TGCCTGGCTG TCCTGTGTGG GCGCTCyiGTC 960 

CACGTTGATT TCTCCTCATC TGTJ^TCACAA AGGAGCAGGC ATGGAAGTGT TCTGAGCAGC 1020 

AATTTTACTT ACCACACGAG TGATGGAGAT GCATTGCTCC TTCTCTGA 1068 
(233) INFORMATION FOR SEQ ID NO:232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232: 

Met Asp Gln Phe Pro Glu Ser Val Thr Glu Asn Phe Glu Tyr Asp Asp 
1 5 10 15 

Leu Ala Glu Ala Cys Tyr Ile Gly Asp Ile Val Val Phe Gly Thr Val 
20 25 30 

Phe Leu Ser Ile Phe Tyr Ser Val Ile Phe Ala Ile Gly Leu Val Gly 
35 40 45 

Asn Leu Leu Val Val Phe Ala Leu Thr Asn Ser Lys Lys Pro Lys Ser 
50 55 60 

Val Thr Asp Ile Tyr Leu Leu Asn Leu Ala Leu Ser Asp Leu Leu Phe 
65 70 75 80 

Val Ala Thr Leu Pro Phe Trp Thr His Tyr Leu Ile Asn Glu Lys Gly 
85 90 95 

Leu His Asn Ala Met Cys Lys Phe Thr Thr Ala Phe Phe Phe Ile Gly 
100 105 110 

Phe Phe Gly Ser Ile Phe Phe Ile Thr Val Ile Ser Ile Asp Arg Tyr 
115 120 125 

Leu Ala Ile Val Leu Ala Ala Asn Ser Met Asn Asn Arg Thr Val Gln 
130 135 140 

His Gly Val Thr Ile Ser Leu Gly Val Trp Ala Ala Ala Ile Leu Val 
145 150 155 160 

Ala Ala Pro Gln Phe Met Phe Thr Lys Gln Lys Glu Asn Glu Cys Leu 
165 170 175 

Gly Asp Tyr Pro Glu Val Leu Gln Glu Ile Trp Pro Val Leu Arg Asn 
180 185 190 

Val Glu Thr Asn Phe Leu Gly Phe Leu Leu Pro Leu Leu Ile Met Ser 
195 200 205 

Tyr Cys Tyr Phe Arg Ile Ile Gln Thr Leu Phe Ser Cys Lys Asn His 
210 215 220 

Lys Lys Ala Lys Ala Lys Lys Leu Ile Leu Leu Val Val Ile Val Phe 
225 230 235 240 

Phe Leu Phe Trp Thr Pro Tyr Asn Val Met Ile Phe Leu Glu Thr Leu 
245 250 255 

Lys Leu Tyr Asp Phe Phe Pro Ser Cys Asp Met Arg Lys Asp Leu Arg 
260 265 270 

Leu Ala Leu Ser Val Thr Glu Thr Val Ala Phe Ser His Cys Cys Leu 
275 280 285 

Asn Pro Leu Ile Tyr Ala Phe Ala Gly Glu Lys Phe Arg Arg Tyr Leu 
290 295 300 

Tyr His Leu Tyr Gly Lys Cys Leu Ala Val Leu Cys Gly Arg Ser Val 
305 310 315 320 

His Val Asp Phe Ser Ser Ser Glu Ser Gln Arg Ser Arg His Gly Ser 
325 330 335 

Val Leu Ser Ser Asn Phe Thr Tyr His Thr Ser Asp Gly Asp Ala Leu 
340 345 350 

Leu Leu Leu 
355 
(234) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:233: 

GGCTTAAGAG CATCATCGTG GTGCTGGTG 29 

(235) INFORMATION FOR SEQ ID NO:234: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:234: 

GTCACCACCA GCACCACGAT GATGCTCTTA AGCC 34 

(236) INFORMATION FOR SEQ ID NO:235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 

CAAAGAAAGT ACTGGGCATC GTCTTCTTCC T 31 

(237) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

TGCTCTAGAT TCCAGATAGG TGAAAACTTG 30 

(238) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 
CTAGGGGCAC CATGCAGGCT ATCAACAATC AAAGAAAAGC TAAGAAAGTC 50 
(239) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

CAAGGACTTT CTTAGCTTTT CTTTCATTGT TGATAGCCTG CATGGTGCCC 50 

(240) INFORMATION FOR SEQ ID NO:239: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239: 
CGGCGGCAGA AGGCGAAACG CATGATCCTC GCGGT 35 

(241) INFORMATION FOR SEQ ID NO:240: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:240: 

ACCGCGAGGA TCATGCGTTT CGCCTTCTGC CGCCG 

(242) INFORMATION FOR SEQ ID NO: 241: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241: 
GAGACATATT ATCTGCCACG GAGG 

(243) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 
TTGGCATAGA AACCGGACCC AAGG 

(244) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243: 
TAAGAATTCC ATAAAAATTA TGGAATGG 
(245) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 
CCAGGATCCA GCTGAAGTCT TCCATCATTC 
(246) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245: 

ATGAATGGGG TCTCGGAGGG GACCAGAGGC TGCAGTGACA GGCAACCTGG GGTGCTGACA 60 
CGTGATCGCT CTTGTTCCAG GAAGATGAAC TCTTCCGGAT GCCTGTCTGA GGAGGTGGGG 120 
TCCCTCCGCC CACTGACTGT GGTTATCCTG TCTGCGTCCA TTGTCGTCGG AGTGCTGGGC 180 
AATGGGCTGG TGCTGTGGAT GACTGTCTTC CGTATGGCAC GCACGGTCTC CACCGTCTGC 240 
TTCTTCCACC TGGCCCTTGC CGATTTCATG CTCTCACTGT CTCTGCCCAT TGCCATGTAC 300 
TATATTGTCT CCAGGCAGTG GCTCCTCGGA GAGTGGGCCT GCAAACTCTA CATCACCTTT 360 
GTGTTCCTCA GCTACTTTGC CAGTAACTGC CTCCTTCTCT TCATCTCTGT GGACCGTTGC 420 
ATCTCTGTCC TCTACCCCGT CTGGGCCCTG AACCACCGCA CTGTGCAGCG GGCGAGCTGG 480 
CTGGCCTTTG GGGTGTGGCT CCTGGCCGCC GCCTTGTGCT CTGCGCACCT GAAATTCCGG 540 
ACAACCAGAA AATGGAATGG CTGTACGCAC TGCTACTTCG CGTTCAACTC TGACAATGAG 600 
ACTGCCCAGA TTTGGATTGA AGGGGTCGTG GAGGGACACA TTATAGGGAC CATTGGCCAC 660 
TTCCTGCTGG GCTTCCTGGG GCCCTTAGCA ATCATAGGCA CCTGCGCCCA CCTCATCCGG 720 
GCCAAGCTCT TGCGGGAGGG CTGGGTCCAT GCCAACCGGC CCGCGAGGCT GCTGCTGGTC 780 
CTGGTGAGCG CTTTCTTTAT CTTCTGGTCC CCGTTTAACG TGGTGCTGTT GGTCCATCTG 840 
TGGCGACGGG TGATGCTCAA GGAAATCTAC CACCCCCGGA TGCTGCTCAT CCTCCAGGCT 900 
AGCTTTGCCT TGGGCTGTGT CAACAGCAGC CTCAACCCCT TCCTCTACGT CTTCGTTCGC 960 
AGAGATTTCC AAGAAAAGTT TTTCCAGTCT TTGACTTCTG CCCTGGCGAG GGCGTTTGGA 1020 
GAGGAGGAGT TTCTGTCATC CTGTCCCCGT GGCAACGCCC CCCGGGAATG A 1071 
(247) INFORMATION FOR SEQ ID NO:246: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 356 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 

Met Asn Gly Val Ser Glu Gly Thr Arg Gly Cys Ser Asp Arg Gln Pro 
1 5 10 15 

Gly Val Leu Thr Arg Asp Arg Ser Cys Ser Arg Lys Met Asn Ser Ser 
20 25 30 

Gly Cys Leu Ser Glu Glu Val Gly Ser Leu Arg Pro Leu Thr Val Val 
35 40 45 

Ile Leu Ser Ala Ser Ile Val Val Gly Val Leu Gly Asn Gly Leu Val 
50 55 60 

Leu Trp Met Thr Val Phe Arg Met Ala Arg Thr Val Ser Thr Val Cys 
65 70 75 80 

Phe Phe His Leu Ala Leu Ala Asp Phe Met Leu Ser Leu Ser Leu Pro 
85 90 95 

Ile Ala Met Tyr Tyr Ile Val Ser Arg Gln Trp Leu Leu Gly Glu Trp 
100 105 110 

Ala Cys Lys Leu Tyr Ile Thr Phe Val Phe Leu Ser Tyr Phe Ala Ser 
115 120 125 

Asn Cys Leu Leu Val Phe Ile Ser Val Asp Arg Cys Ile Ser Val Leu 
130 135 140 

Tyr Pro Val Trp Ala Leu Asn His Arg Thr Val Gln Arg Ala Ser Trp 
145 150 155 160 

Leu Ala Phe Gly Val Trp Leu Leu Ala Ala Ala Leu Cys Ser Ala His 
165 170 175 

Leu Lys Phe Arg Thr Thr Arg Lys Trp Asn Gly Cys Thr His Cys Tyr 
180 185 190 

Leu Ala Phe Asn Ser Asp Asn Glu Thr Ala Gln Ile Trp Ile Glu Gly 
195 200 205 

Val Val Glu Gly His Ile Ile Gly Thr Ile Gly His Phe Leu Leu Gly 
210 215 220 

Phe Leu Gly Pro Leu Ala Ile Ile Gly Thr Cys Ala His Leu Ile Arg 
225 230 235 240 

Ala Lys Leu Leu Arg Glu Gly Trp Val His Ala Asn Arg Pro Ala Arg 
245 250 255 

Leu Leu Leu Val Leu Val Ser Ala Phe Phe Ile Phe Trp Ser Pro Phe 
260 265 270 

Asn Val Val Leu Leu Val His Leu Trp Arg Arg Val Met Leu Lys Glu 
275 280 285 

Ile Tyr His Pro Arg Met Leu Leu Ile Leu Gln Ala Ser Phe Ala Leu 
290 295 300 

Gly Cys Val Asn Ser Ser Leu Asn Pro Phe Leu Tyr Val Phe Val Gly 
305 310 315 320 

Arg Asp Phe Gln Glu Lys Phe Phe Gln Ser Leu Thr Ser Ala Leu Ala 
325 330 335 

Arg Ala Phe Gly Glu Glu Glu Phe Leu Ser Ser Cys Pro Arg Gly Asn 
340 345 350 

Ala Pro Arg Glu 
355 

(248) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 

GCAGAATTCG GCGGCCCCAT GGACCTGCCC CC 32 

(249) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248: 

GCTGGATCCC CCGAGCAGTG GCGTTACTTC 30 

(250) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 903 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249: 

10 ATGGACCTGC CCCCGCAGCT CTCCTTCGGC CTCTATGTGG CCGCCTTTGC GCTGGGCTTC 60 

CCGCTCAACG TCCTGGCCAT CCGAGGCGCG ACGGCCCACG CCCGGCTCCG TCTCACCCCT 120 

AGCCTGGTCT ACGCCCTGAA CCTGGGCTGC TCCGACCTGC TGCTGACAGT CTCTCTGCCC 180 

CTGAAGGCGG TG6AGGCGCT AGCCTCCGGG GCCTGGCCTC TGCCGGCCTC GCTGTGCCCC 240 

GTCTTCGCGG TGGCCCACTT CTTCCCACTC TATGCCGGCG GGGGCTTCCT GGCCGCCCTG 300 

15 AGTGCAGGCC GCTACCTGGG AGCAGCCTTC CCCTTGGGCT ACCAAGCCTT CCGGAGGCCG 360 

TGCTATTCCT GGGGGGTGTG CGCGGCCATC TGGGCCCTCG TCCTGTGTCA CCTGGGTCTG 420 

GTCTTTGGGT TGGAGGCTCC AGGAGGCTGG CTGGACCACA GCAACACCTC CCTGGGCATC 480 

AACACACCGG TCAACGGCTC TCCGGTCTGC CTGGAGGCCT GGGACCCGGC CTCTGCCGGC 540 

CCGGCCCGCT TCAGCCTCTC TCTCCTGCTC TTTTTTCTGC CCTTGGCCAT CACAGCCTTC 600 

20 TGCTACGTGG GCTGCCTCCG* GGCACTGGCC CGCTCCGGCC TGACGCACAG GCGGAAGCTG 660 

CGGGCCGCCT GGGTGGCCGG CGGGGCCCTC CTCACGCTGC TGCTCTGCGT AGGACCCTAC 720 

AACGCCTCCA ACGTGGCCAG CTTCCTGTAC CCCAATCTAG GAGGCTCCTG GCGGAAGCTG 780 

GGGCTCATCA CGGGTGCCTG GAGTGTGGTG CTTAATCCGC TGGTGACCGG TTACTTGGGA 840 

AGGGGTCCTG GCCTGAAGAC AGTGTGTGCG GCAAGAACGC AAGGGGGCAA GTCCCAGAAG 900 

25 TAA 

(251) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 amino acids 

(B) TYPE: amino acid 

30 ( C ) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 



903 
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30 
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(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 



Met Asp Leu Pro Pro Gln Leu Ser Phe Gly Leu Tyr Val Ala Ala Phe
1 5 10 15

Ala Leu Gly Phe Pro Leu Asn Val Leu Ala Ile Arg Gly Ala Thr Ala
20 25 30

His Ala Arg Leu Arg Leu Thr Pro Ser Leu Val Tyr Ala Leu Asn Leu
35 40 45

Gly Cys Ser Asp Leu Leu Leu Thr Val Ser Leu Pro Leu Lys Ala Val
50 55 60

Glu Ala Leu Ala Ser Gly Ala Trp Pro Leu Pro Ala Ser Leu Cys Pro
65 70 75 80

Val Phe Ala Val Ala His Phe Phe Pro Leu Tyr Ala Gly Gly Gly Phe
85 90 95

Leu Ala Ala Leu Ser Ala Gly Arg Tyr Leu Gly Ala Ala Phe Pro Leu
100 105 110

Gly Tyr Gln Ala Phe Arg Arg Pro Cys Tyr Ser Trp Gly Val Cys Ala
115 120 125

Ala Ile Trp Ala Leu Val Leu Cys His Leu Gly Leu Val Phe Gly Leu
130 135 140

Glu Ala Pro Gly Gly Trp Leu Asp His Ser Asn Thr Ser Leu Gly Ile
145 150 155 160

Asn Thr Pro Val Asn Gly Ser Pro Val Cys Leu Glu Ala Trp Asp Pro
165 170 175

Ala Ser Ala Gly Pro Ala Arg Phe Ser Leu Ser Leu Leu Leu Phe Phe
180 185 190

Leu Pro Leu Ala Ile Thr Ala Phe Cys Tyr Val Gly Cys Leu Arg Ala
195 200 205

Leu Ala Arg Ser Gly Leu Thr His Arg Arg Lys Leu Arg Ala Ala Trp
210 215 220

Val Ala Gly Gly Ala Leu Leu Thr Leu Leu Leu Cys Val Gly Pro Tyr
225 230 235 240

Asn Ala Ser Asn Val Ala Ser Phe Leu Tyr Pro Asn Leu Gly Gly Ser
245 250 255

Trp Arg Lys Leu Gly Leu Ile Thr Gly Ala Trp Ser Val Val Leu Asn
260 265 270

Pro Leu Val Thr Gly Tyr Leu Gly Arg Gly Pro Gly Leu Lys Thr Val
275 280 285

Cys Ala Ala Arg Thr Gln Gly Gly Lys Ser Gln Lys
290 295 300

(252) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

CTCAAGCTTA CTCTCTCTCA CCAGTGGCCA C 31

(253) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

CCCTCCTCCC CCGGAGGACC TAGC 24

(254) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:

ATGGATACAG GCCCCGACCA GTCCTACTTC TCCGGCAATC ACTGGTTCGT CTTCTCGGTG 60

TACCTTCTCA CTTTCCTGGT GGGGCTCCCC CTCAACCTGC TGGCCCTGGT GGTCTTCGTG 120

GGCAAGCTGC AGCGCCGCCC GGTGGCCGTG GACGTGCTCC TGCTCAACCT GACCGCCTCG 180

GACCTGCTCC TGCTGCTGTT CCTGCCTTTC CGCATGGTGG AGGCAGCCAA TGGCATGCAC 240

TGGCCCCTGC CCTTCATCCT CTGCCCACTC TCTGGATTCA TCTTCTTCAC CACCATCTAT 300

CTCACCGCCC TCTTCCTGGC AGCTGTCAGC ATTGAACGCT TCCTGAGTGT GGCCCACCCA 360

CTGTGGTACA AGACCCGGCC GAGGCTGGGG CAGGCAGGTC TGGTCAGTGT GGCCTGCTGG 420

CTGTTGGCCT CTGCTCACTG CAGCGTGGTC TACGTCATAG AATTCTCAGG GGACATCTCC 480

CACAGCCAGG GCACCAATGG GACCTGCTAC CTGGAGTTCC GGAAGGACCA GCTAGCCATC 540

CTCCTGCCCG TGCGGCTGGA GATGGCTGTG GTCCTCTTTG TGGTCCCGCT GATCATCACC 600

AGCTACTGCT ACAGCCGCCT GGTGTGGATC CTCGGCAGAG GGGGCAGCCA CCGCCGGCAG 660

AGGAGGGTGG CGGGGCTGTT GGCGGCCACG CTGCTCAACT TCCTTGTCTG CTTTGGGCCC 720

TACAACGTGT CCCATGTCGT GGGCTATATC TGCGGTGAAA GCCCGGCATG GAGGATCTAC 780

GTGACGCTTC TCAGCACCCT GAACTCCTGT GTCGACCCCT TTGTCTACTA CTTCTCCTCC 840

TCCGGGTTCC AAGCCGACTT TCATGAGCTG CTGAGGAGGT TGTGTGGGCT CTGGGGCCAG 900

TGGCAGCAGG AGAGCAGCAT GGAGCTGAAG GAGCAGAAGG GAGGGGAGGA GCAGAGAGCG 960

GACCGACCAG CTGAAAGAAA GACCAGTGAA CACTCACAGG GCTGTGGAAC TGGTGGCCAG 1020

GTGGCCTGTG CTGAAAGCTA G 1041

(255) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Met Asp Thr Gly Pro Asp Gln Ser Tyr Phe Ser Gly Asn His Trp Phe
1 5 10 15

Val Phe Ser Val Tyr Leu Leu Thr Phe Leu Val Gly Leu Pro Leu Asn
20 25 30

Leu Leu Ala Leu Val Val Phe Val Gly Lys Leu Gln Arg Arg Pro Val
35 40 45

Ala Val Asp Val Leu Leu Leu Asn Leu Thr Ala Ser Asp Leu Leu Leu
50 55 60

Leu Leu Phe Leu Pro Phe Arg Met Val Glu Ala Ala Asn Gly Met His
65 70 75 80

Trp Pro Leu Pro Phe Ile Leu Cys Pro Leu Ser Gly Phe Ile Phe Phe
85 90 95

Thr Thr Ile Tyr Leu Thr Ala Leu Phe Leu Ala Ala Val Ser Ile Glu
100 105 110

Arg Phe Leu Ser Val Ala His Pro Leu Trp Tyr Lys Thr Arg Pro Arg
115 120 125

Leu Gly Gln Ala Gly Leu Val Ser Val Ala Cys Trp Leu Leu Ala Ser
130 135 140

Ala His Cys Ser Val Val Tyr Val Ile Glu Phe Ser Gly Asp Ile Ser
145 150 155 160

His Ser Gln Gly Thr Asn Gly Thr Cys Tyr Leu Glu Phe Arg Lys Asp
165 170 175

Gln Leu Ala Ile Leu Leu Pro Val Arg Leu Glu Met Ala Val Val Leu
180 185 190

Phe Val Val Pro Leu Ile Ile Thr Ser Tyr Cys Tyr Ser Arg Leu Val
195 200 205

Trp Ile Leu Gly Arg Gly Gly Ser His Arg Arg Gln Arg Arg Val Ala
210 215 220

Gly Leu Leu Ala Ala Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro
225 230 235 240

Tyr Asn Val Ser His Val Val Gly Tyr Ile Cys Gly Glu Ser Pro Ala
245 250 255

Trp Arg Ile Tyr Val Thr Leu Leu Ser Thr Leu Asn Ser Cys Val Asp
260 265 270

Pro Phe Val Tyr Tyr Phe Ser Ser Ser Gly Phe Gln Ala Asp Phe His
275 280 285

Glu Leu Leu Arg Arg Leu Cys Gly Leu Trp Gly Gln Trp Gln Gln Glu
290 295 300

Ser Ser Met Glu Leu Lys Glu Gln Lys Gly Gly Glu Glu Gln Arg Ala
305 310 315 320

Asp Arg Pro Ala Glu Arg Lys Thr Ser Glu His Ser Gln Gly Cys Gly
325 330 335

Thr Gly Gly Gln Val Ala Cys Ala Glu Ser
340 345

(256) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

TTTAAGCTTC CCCTCCAGGA TGCTGCCGGA C

(257) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:

GGCGAATTCT GAAGGTCCAG GGAAACTGCT A
(258) INFORMATION FOR SEQ ID NO: 257:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 993 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:

ATGCTGCCGG ACTGGAAGAG CTCCTTGATC CTCATGGCTT ACATCATCAT CTTCCTCACT 60

GGCCTCCCTG CCAACCTCCT GGCCCTGCGG GCCTTTGTGG GGCGGATCCG CCAGCCCCAG 120

CCTGCACCTG TGCACATCCT CCTGCTGAGC CTGACGCTGG CCGACCTCCT CCTGCTGCTG 180

CTGCTGCCCT TCAAGATCAT CGAGGCTGCG TCGAACTTCC GCTGGTACCT GCCCAAGGTC 240

GTCTGCGCCC TCACGAGTTT TGGCTTCTAC AGCAGCATCT ACTGCAGCAC GTGGCTCCTG 300

GCGGGCATCA GCATCGAGCG CTACCTGGGA GTGGCTTTCC CCGTGCAGTA CAAGCTCTCC 360

CGCCGGCCTC TGTATGGAGT GATTGCAGCT CTGGTGGCCT GGGTTATGTC CTTTGGTCAC 420

TGCACCATCG TGATCATCGT TCAATACTTG AACACGACTG AGCAGGTCAG AAGTGGCAAT 480

GAAATTACCT GCTACGAGAA CTTCACCGAT AACCAGTTGG ACGTGGTGCT GCCCGTGCGG 540

CTGGAGCTGT GCCTGGTGCT CTTCTTCATC CCCATGGCAG TCACCATCTT CTGCTACTGG 600

CGTTTTGTGT GGATCATGCT CTCCCAGCCC CTTCTGGGGG CCCAGAGGCG GCGCCGAGCC 660

GTGGGGCTGG CTGTGGTGAC GCTGCTCAAT TTCCTGGTGT GCTTCGGACC TTACAACGTG 720



WO 00/22129 PCT/US99/23938


TCCCACCTGG TGGGGTATCA CCAGAGAAAA AGCCCCTGGT GGCGGTCAAT AGCCGTGGTG 780

TTCAGTTCAC TCAACGCCAG TCTGGACCCC CTGCTCTTCT ATTTCTCTTC TTCAGTGGTG 840

CGCAGGGCAT TTGGGAGAGG GCTGCAGGTG CTGCGGAATC AGGGCTCCTC CCTGTTGGGA 900

CGCAGAGGCA AAGACACAGC AGAGGGGACA AATGAGGACA GGGGTGTGGG TCAAGGAGAA 960

GGGATGCCAA GTTCGGACTT CACTACAGAG TAG 993
(259) INFORMATION FOR SEQ ID NO: 258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258: 

Met Leu Pro Asp Trp Lys Ser Ser Leu Ile Leu Met Ala Tyr Ile Ile
1 5 10 15

Ile Phe Leu Thr Gly Leu Pro Ala Asn Leu Leu Ala Leu Arg Ala Phe
20 25 30

Val Gly Arg Ile Arg Gln Pro Gln Pro Ala Pro Val His Ile Leu Leu
35 40 45

Leu Ser Leu Thr Leu Ala Asp Leu Leu Leu Leu Leu Leu Leu Pro Phe
50 55 60

Lys Ile Ile Glu Ala Ala Ser Asn Phe Arg Trp Tyr Leu Pro Lys Val
65 70 75 80

Val Cys Ala Leu Thr Ser Phe Gly Phe Tyr Ser Ser Ile Tyr Cys Ser
85 90 95

Thr Trp Leu Leu Ala Gly Ile Ser Ile Glu Arg Tyr Leu Gly Val Ala
100 105 110

Phe Pro Val Gln Tyr Lys Leu Ser Arg Arg Pro Leu Tyr Gly Val Ile
115 120 125

Ala Ala Leu Val Ala Trp Val Met Ser Phe Gly His Cys Thr Ile Val
130 135 140

Ile Ile Val Gln Tyr Leu Asn Thr Thr Glu Gln Val Arg Ser Gly Asn
145 150 155 160

Glu Ile Thr Cys Tyr Glu Asn Phe Thr Asp Asn Gln Leu Asp Val Val
165 170 175

Leu Pro Val Arg Leu Glu Leu Cys Leu Val Leu Phe Phe Ile Pro Met
180 185 190

Ala Val Thr Ile Phe Cys Tyr Trp Arg Phe Val Trp Ile Met Leu Ser
195 200 205

Gln Pro Leu Val Gly Ala Gln Arg Arg Arg Arg Ala Val Gly Leu Ala
210 215 220

Val Val Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro Tyr Asn Val
225 230 235 240

Ser His Leu Val Gly Tyr His Gln Arg Lys Ser Pro Trp Trp Arg Ser
245 250 255

Ile Ala Val Val Phe Ser Ser Leu Asn Ala Ser Leu Asp Pro Leu Leu
260 265 270

Phe Tyr Phe Ser Ser Ser Val Val Arg Arg Ala Phe Gly Arg Gly Leu
275 280 285

Gln Val Leu Arg Asn Gln Gly Ser Ser Leu Leu Gly Arg Arg Gly Lys
290 295 300

Asp Thr Ala Glu Gly Thr Asn Glu Asp Arg Gly Val Gly Gln Gly Glu
305 310 315 320

Gly Met Pro Ser Ser Asp Phe Thr Thr Glu
325 330

(260) INFORMATION FOR SEQ ID NO: 259:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

CCCAAGCTTC GGGCACCATG GACACCTCCC 30

(261) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

ACAGGATCCA AATGCACAGC ACTGGTAAGC 30

(262) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

CTATAACTGG GTTACATGGT TTAAC 25

(263) INFORMATION FOR SEQ ID NO: 262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

TTTGAATTCA CATATTAATT AGAGACATGG 30

(264) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

ATGGACACCT CCCGGCTCGG TGTGCTCCTG TCCTTGCCTG TGCTGCTGCA GCTGGCGACC 60

GGGGGCAGCT CTCCCAGGTC TGGTGTGTTG CTGAGGGGCT GCCCCACACA CTGTCATTGC 120 

GAGCCCGACG GCAGGATGTT GCTCAGGGTG GACTGCTCCG ACCTGGGGCT CTCGGAGCTG 180

CCTTCCT^CC TCAGCGTCTT CACCTCCTAC CTAGACCTCA GTATGAACAA CATCAGTCAG 240 

CTGCTCCCGA ATCCCCTGCC CAGTCTCCGC TTCCTGGAGG AGTTACGTCT TGCGGGAAAC 300 

GCTCTGACAT ACATTCCCAA GGGAGCATTC ACTGGCCTTT ACAGTCTTAA AGTTCTTATG 360
CTGCAGAATA ATCAGCTAAG ACACGTACCC ACAGAAGCTC TGCAGAATTT GCGAAGCCTT 420
CAATCCCTCC GTCTGGATGC TAACCACATC AGCTATCTCC CCCCAAGCTG TTTCAGTGGC 480
CTGCATTCCC TGAGGCACCT GTGGCTGGAT GACAATCCGT TAACAGAAAT CCCCGTCCAG 540
GCTTTTAGAA GTTTATCGGC ATTGCAAGCC ATCACCTTGG CCCTGAACAA AATACACCAC 600
ATACCAGACT ATGCCTTTGG AAACCTCTCC AGCTTCGTAG TTCTACATCT CCATAACAAT 660
AGAATCCACT CCCTGGGAAA GAAATGCTTT GATGGGCTCC ACAGCCTAGA GACTTTAGAT 720
TTAAATTACA ATAACCTTGA TCAATTGCCC ACTGCAATTA GGACACTCTC CAACCTTAAA 780
GAACTAGGAT TTCATAGCAA CAATATCAGG TCGATACCTC AGAAAGCATT TGTAGGCAAC 840
CCTTCTCTTA TTACAATACA TTTCTATGAC AATCCCATCC AATTTCTTCG GAGATCTCCT 900
TTTCAACATT TACCTGAACT AAGAACACTC ACTCTGAATG GTGCCTCACA AATAACTGAA 960
TTTCCTGATT TAACTGGAAC TGCAAACCTG GAGAGTCTGA CTTTAACTGG AGCACAGATC 1020
TCATCTCTTC CTCAAACCGT CTGCAATCAG TTACCTAATC TCCAAGTGCT AGATCTGTCT 1080
TACAACCTAT TAGAAGATTT ACCCAGTTTT TCAGTCTCCC AAAAGCTTCA GAAAATTGAC 1140
CTAAGACATA ATGAAATCTA CGAAATTAAA GTTGACACTT TCCAGCAGTT GCTTAGCCTC 1200
CGATCGCTGA ATTTGGCTTG GAACAAAATT GCTATTATTC ACCCCAATO^^ 1260
TTGCCATCCC TAATAAAGCT GGACCTATCG TCCAACCTCC TGTCGTCTTT TCCTATAACT 1320
GGGTTACATG GTTTAACTCA CTTAAAATTA ACAGGAAATC ATCCCTTACA GAGCTTGATA 1380
TCATCTGAAA ACTTTCCAGA ACTCAAGGTT ATAGAAATGC CTTATGCTTA CCAGTGCTGT 1440
GCATTTGGAG TGTGTGAGAA TGCCTATAAG ATTOCTAATC AATGGAATAA AGGTGACAAC 1500

AGCAGTATGG ACGACCTTCA TAAGAAAGAT GCTGGAATGT TTCAGGCTCA AGAl^^ 1560
GACCTTGAAG ATTTCCn^CT TGACTTTGAG GAAGACC^ 

TGTTCACCTT CCCCAGGCCC CTTCAAACCC TGTGAACACC TGCTTGATGG CTGGCTGATC 1680

AGAATTGGAG TCTGGACCAT AGCAGTTCTG GCACTTACTT GTAATGCTTT GGTGACTTCA 1740

ACAGTTTTCA GATCCCCTCT GTACATTTCC CCCATTAAAC TGTTAATTCG GGTCATCGCA 1800

GCAGTGAACA TCCTCACGGG AGTCTCCAGT GCCGTCCTCG CTGGTGl^ TGCGrrC^^ 1860

TTTGGCAGCT TTCCACGACA TGGTGCCTGG TGGGAGAATC GGGTTGGTTG CCATGTCATT 1920 

GGTTTTTTCT CCATTTTTGC TTCAGAATCA TCTGTTTTCC TGCTTACTCT GGCAGCCCTG 1980

GAGCGTGGGT TCTCTGTGAA ATATTCTCCA AAATTTGAAA CGAAAGCTCC ATTTTCTAGC 2040

CTGAAAGTAA TCATTTTGCT CTGTGCCCTG CTGGCCTTGA CCATGGCCGC AGTTCCCCTG 2100

CTGGGTGGCA GCAAGTATGG CGCCTCCCCT CTCTGCCTGC CTTTGCCTTT TGGGGAGCCC 2160 

AGCACCATGG GCTACATGGT CGCTCTCATC TTGCTCAATT CCCTTTGCTT CCTCATGATG 2220

ACCATTGCCT ACACCAAGCT CTACTGCAAT TTGGACAAGG GAGACCTGGA GAATATTTGG 2280 

GACTGCTCTA TGGTAAAACA CATTGCCCTG TTGCTCTTCA CCAACTGCAT CCTAAACTGC 2340

CCTGTGGCTT TCTTGTCCTT CTCCTCTTTA ATAAACCTTA CATTTATCAG TCCTGAAGTA 2400 

ATTAAGTTTA TCCTTCTGGT GGTAGTCCCA CTTCCTGCAT GTCTCAATCC CCTTCTCTAC 2460 

ATCTTGTTCA ATCCTCACTT TAAGGAGGAT CTGGTGAGCC TGAGAAAGCA AACCTACGTC 2520 

TGGACAAGAT CAAAACACCC AAGCTTGATG TCAATTAACT CTGATGATGT CGAAAAACAG 2580 

TCCTGTGACT CAACTCAAGC CTTGGTAACC TTTACCAGCT CCAGCATCAC TTATGACCTG 2640

CCTCCCAGTT CCGTGCCATC ACCAGCTTAT CCAGTGACTG AGAGCTGCCA TCTTTCCTCT 2700 

GTGGCATTTG TCCCATGTCT CTAA 2724 
(265) INFORMATION FOR SEQ ID NO: 264:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:

Met Asp Thr Ser Arg Leu Gly Val Leu Leu Ser Leu Pro Val Leu Leu
1 5 10 15

Gln Leu Ala Thr Gly Gly Ser Ser Pro Arg Ser Gly Val Leu Leu Arg
20 25 30

Gly Cys Pro Thr His Cys His Cys Glu Pro Asp Gly Arg Met Leu Leu
35 40 45

Arg Val Asp Cys Ser Asp Leu Gly Leu Ser Glu Leu Pro Ser Asn Leu
50 55 60

Ser Val Phe Thr Ser Tyr Leu Asp Leu Ser Met Asn Asn Ile Ser Gln
65 70 75 80

Leu Leu Pro Asn Pro Leu Pro Ser Leu Arg Phe Leu Glu Glu Leu Arg
85 90 95

Leu Ala Gly Asn Ala Leu Thr Tyr Ile Pro Lys Gly Ala Phe Thr Gly
100 105 110

Leu Tyr Ser Leu Lys Val Leu Met Leu Gln Asn Asn Gln Leu Arg His
115 120 125

Val Pro Thr Glu Ala Leu Gln Asn Leu Arg Ser Leu Gln Ser Leu Arg
130 135 140

Leu Asp Ala Asn His Ile Ser Tyr Val Pro Pro Ser Cys Phe Ser Gly
145 150 155 160

Leu His Ser Leu Arg His Leu Trp Leu Asp Asp Asn Ala Leu Thr Glu
165 170 175

Ile Pro Val Gln Ala Phe Arg Ser Leu Ser Ala Leu Gln Ala Met Thr
180 185 190

Leu Ala Leu Asn Lys Ile His His Ile Pro Asp Tyr Ala Phe Gly Asn
195 200 205

Leu Ser Ser Leu Val Val Leu His Leu His Asn Asn Arg Ile His Ser
210 215 220

Leu Gly Lys Lys Cys Phe Asp Gly Leu His Ser Leu Glu Thr Leu Asp
225 230 235 240

Leu Asn Tyr Asn Asn Leu Asp Glu Phe Pro Thr Ala Ile Arg Thr Leu
245 250 255

Ser Asn Leu Lys Glu Leu Gly Phe His Ser Asn Asn Ile Arg Ser Ile
260 265 270

Pro Glu Lys Ala Phe Val Gly Asn Pro Ser Leu Ile Thr Ile His Phe
275 280 285

Tyr Asp Asn Pro Ile Gln Phe Val Gly Arg Ser Ala Phe Gln His Leu
290 295 300

Pro Glu Leu Arg Thr Leu Thr Leu Asn Gly Ala Ser Gln Ile Thr Glu
305 310 315 320

Phe Pro Asp Leu Thr Gly Thr Ala Asn Leu Glu Ser Leu Thr Leu Thr
325 330 335

Gly Ala Gln Ile Ser Ser Leu Pro Gln Thr Val Cys Asn Gln Leu Pro
340 345 350

Asn Leu Gln Val Leu Asp Leu Ser Tyr Asn Leu Leu Glu Asp Leu Pro
355 360 365

Ser Phe Ser Val Cys Gln Lys Leu Gln Lys Ile Asp Leu Arg His Asn
370 375 380

Glu Ile Tyr Glu Ile Lys Val Asp Thr Phe Gln Gln Leu Leu Ser Leu
385 390 395 400

Arg Ser Leu Asn Leu Ala Trp Asn Lys Ile Ala Ile Ile His Pro Asn
405 410 415

Ala Phe Ser Thr Leu Pro Ser Leu Ile Lys Leu Asp Leu Ser Ser Asn
420 425 430

Leu Leu Ser Ser Phe Pro Ile Thr Gly Leu His Gly Leu Thr His Leu
435 440 445

Lys Leu Thr Gly Asn His Ala Leu Gln Ser Leu Ile Ser Ser Glu Asn
450 455 460

Phe Pro Glu Leu Lys Val Ile Glu Met Pro Tyr Ala Tyr Gln Cys Cys
465 470 475 480

Ala Phe Gly Val Cys Glu Asn Ala Tyr Lys Ile Ser Asn Gln Trp Asn
485 490 495

Lys Gly Asp Asn Ser Ser Met Asp Asp Leu His Lys Lys Asp Ala Gly
500 505 510

Met Phe Gln Ala Gln Asp Glu Arg Asp Leu Glu Asp Phe Leu Leu Asp
515 520 525

Phe Glu Glu Asp Leu Lys Ala Leu His Ser Val Gln Cys Ser Pro Ser
530 535 540

Pro Gly Pro Phe Lys Pro Cys Glu His Leu Leu Asp Gly Trp Leu Ile
545 550 555 560

Arg Ile Gly Val Trp Thr Ile Ala Val Leu Ala Leu Thr Cys Asn Ala
565 570 575

Leu Val Thr Ser Thr Val Phe Arg Ser Pro Leu Tyr Ile Ser Pro Ile
580 585 590

Lys Leu Leu Ile Gly Val Ile Ala Ala Val Asn Met Leu Thr Gly Val
595 600 605

Ser Ser Ala Val Leu Ala Gly Val Asp Ala Phe Thr Phe Gly Ser Phe
610 615 620

Ala Arg His Gly Ala Trp Trp Glu Asn Gly Val Gly Cys His Val Ile
625 630 635 640

Gly Phe Leu Ser Ile Phe Ala Ser Glu Ser Ser Val Phe Leu Leu Thr
645 650 655

Leu Ala Ala Leu Glu Arg Gly Phe Ser Val Lys Tyr Ser Ala Lys Phe
660 665 670

Glu Thr Lys Ala Pro Phe Ser Ser Leu Lys Val Ile Ile Leu Leu Cys
675 680 685

Ala Leu Leu Ala Leu Thr Met Ala Ala Val Pro Leu Leu Gly Gly Ser
690 695 700

Lys Tyr Gly Ala Ser Pro Leu Cys Leu Pro Leu Pro Phe Gly Glu Pro
705 710 715 720

Ser Thr Met Gly Tyr Met Val Ala Leu Ile Leu Leu Asn Ser Leu Cys
725 730 735

Phe Leu Met Met Thr Ile Ala Tyr Thr Lys Leu Tyr Cys Asn Leu Asp
740 745 750

Lys Gly Asp Leu Glu Asn Ile Trp Asp Cys Ser Met Val Lys His Ile
755 760 765

Ala Leu Leu Leu Phe Thr Asn Cys Ile Leu Asn Cys Pro Val Ala Phe
770 775 780

Leu Ser Phe Ser Ser Leu Ile Asn Leu Thr Phe Ile Ser Pro Glu Val
785 790 795 800

Ile Lys Phe Ile Leu Leu Val Val Val Pro Leu Pro Ala Cys Leu Asn
805 810 815

Pro Leu Leu Tyr Ile Leu Phe Asn Pro His Phe Lys Glu Asp Leu Val
820 825 830

Ser Leu Arg Lys Gln Thr Tyr Val Trp Thr Arg Ser Lys His Pro Ser
835 840 845

Leu Met Ser Ile Asn Ser Asp Asp Val Glu Lys Gln Ser Cys Asp Ser
850 855 860

Thr Gln Ala Leu Val Thr Phe Thr Ser Ser Ser Ile Thr Tyr Asp Leu
865 870 875 880

Pro Pro Ser Ser Val Pro Ser Pro Ala Tyr Pro Val Thr Glu Ser Cys
885 890 895

His Leu Ser Ser Val Ala Phe Val Pro Cys Leu
900 905

(266) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

CGGAAGCTGC GGGCCAAATG GGTGGCCGGC 30

(267) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:

CAGAGGAGGG TGAAGGGGCT GTTGGCG

(268) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

GGCGGCGCCG AGCCAAGGGG CTGGCTGTGG

(269) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:

GGGACTGCTC TATGAAAAAA CACATTGCCC TG

(270) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269: 



ATGAATGGGG TCTCGGAGGG GACCAGAGGC TGCAGTGACA GGCAACCTGG GGTCCTGACA 60

CGTGATCGCT CTTGTTCCAG GAAGATGAAC TCTTCCGGAT GCCTGTCTGA GGAGGTGGGG 120

TCCCTCCGCC CACTGACTGT GGTTATCCTG TCTGCGTCCA TTGTCGTCGG AGTGCTGGGC 180

AATGGGCTGG TGCTGTGGAT GACTGTCTTC CGTATGGCAC GCACGGTCTC CACCGTCTGC 240

TTCTTCCACC TGGCCCTTGC CGAl^CATG CTCTCACTGT CTCTGCCCAT TGCCATGTAC 300

TATArraxcn. CCAGGCAGTG GCTCCTCGGA GAGTGGGCCT GCAAACTCTA CATO^^^ 360

GTGTTCCTCA GCTACTTTGC CAGTAACTGC CTCCTTGTCT TCATCTCTGT GGACCGTTGC 420

ATCTCTGTCC T^TACCCCGT CTGGGCCCTG AACCACCGCA CTGTGCAGCG GGCGAGCTGG 480

CTGGCCTTTG GGGTGTGGCT CCTGGCCGCC GCCTTGTGCT CTGCGCACCT GAAATTCCGG 540

ACAACCAGAA AATGGAATGG CTGTACGCAC TGCTACTTGG CGTTCAACTC TGACAATGAG 600

ACTGCCCAGA TTTGGATTGA AGGGGTCGTG GAGGGACACA TTATAGGGAC CAITG^^^^ 660

TTCCTGCTGG GCTTCCTGGG GCCCTTAGCA ATCATAGGCA CCTGCGCCCA CCTCATCCGG 720

GCCAAGCTCT TGCGGGAGGG CTGGGTCCAT GCCAACCGGC CCAAGAGGCT GCTGCTGGTG 780

CTGGTGAGCG CTTTCTTTAT CTTCTGGTCC CCGTTTAACG TGGTGCTGTT GGTCCATCIX. 840

TGGCGACGGG TGATGCTCAA GGAAATCTAC CACCCCCGGA TGCTCCTCAT CCTCCAGGCT 900

AGCTTTGCCT TGGGCTGTGT CAACAGCAGC CTCAACCCCT TCCTCTACGT CTTCGTTGGC 960

AGAGATTTCC AAGAAAAGTT TTTCCAGTCT TTGACWCTC CCCTGGCGAG GGCGTTTGGA 1020

GAGGAGGAGT TTCTGTCATC CTGTCCCCGT GGCAACGCCC CCCGGGAATG A 1071

(271) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 356 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:

Met Asn Gly Val Ser Glu Gly Thr Arg Gly Cys Ser Asp Arg Gln Pro
1 5 10 15

Gly Val Leu Thr Arg Asp Arg Ser Cys Ser Arg Lys Met Asn Ser Ser
20 25 30

Gly Cys Leu Ser Glu Glu Val Gly Ser Leu Arg Pro Leu Thr Val Val
35 40 45

Ile Leu Ser Ala Ser Ile Val Val Gly Val Leu Gly Asn Gly Leu Val
50 55 60

Leu Trp Met Thr Val Phe Arg Met Ala Arg Thr Val Ser Thr Val Cys
65 70 75 80

Phe Phe His Leu Ala Leu Ala Asp Phe Met Leu Ser Leu Ser Leu Pro
85 90 95

Ile Ala Met Tyr Tyr Ile Val Ser Arg Gln Trp Leu Leu Gly Glu Trp
100 105 110

Ala Cys Lys Leu Tyr Ile Thr Phe Val Phe Leu Ser Tyr Phe Ala Ser
115 120 125

Asn Cys Leu Leu Val Phe Ile Ser Val Asp Arg Cys Ile Ser Val Leu
130 135 140

Tyr Pro Val Trp Ala Leu Asn His Arg Thr Val Gln Arg Ala Ser Trp
145 150 155 160

Leu Ala Phe Gly Val Trp Leu Leu Ala Ala Ala Leu Cys Ser Ala His
165 170 175

Leu Lys Phe Arg Thr Thr Arg Lys Trp Asn Gly Cys Thr His Cys Tyr
180 185 190

Leu Ala Phe Asn Ser Asp Asn Glu Thr Ala Gln Ile Trp Ile Glu Gly
195 200 205

Val Val Glu Gly His Ile Ile Gly Thr Ile Gly His Phe Leu Leu Gly
210 215 220

Phe Leu Gly Pro Leu Ala Ile Ile Gly Thr Cys Ala His Leu Ile Arg
225 230 235 240

Ala Lys Leu Leu Arg Glu Gly Trp Val His Ala Asn Arg Pro Lys Arg
245 250 255

Leu Leu Leu Val Leu Val Ser Ala Phe Phe Ile Phe Trp Ser Pro Phe
260 265 270

Asn Val Val Leu Leu Val His Leu Trp Arg Arg Val Met Leu Lys Glu
275 280 285

Ile Tyr His Pro Arg Met Leu Leu Ile Leu Gln Ala Ser Phe Ala Leu
290 295 300

Gly Cys Val Asn Ser Ser Leu Asn Pro Phe Leu Tyr Val Phe Val Gly
305 310 315 320

Arg Asp Phe Gln Glu Lys Phe Phe Gln Ser Leu Thr Ser Ala Leu Ala
325 330 335

Arg Ala Phe Gly Glu Glu Glu Phe Leu Ser Ser Cys Pro Arg Gly Asn
340 345 350

Ala Pro Arg Glu
355

(272) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 903 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:

ATGGACCTGC CCCCGCAGCT CTCCTTCGGC CTCTATGTGG CCGCCTTTGC GCTCGGCTTC 60
CCGCTCAACG TCCTGGCCAT CCGAGGCGCG ACGGCCCACG CCCGGCTCCG TCTCACCCCT 120
AGCCTGGTCT ACGCCCTGAA CCTGGGCTGC TCCGACCTGC TGCTGACAGT CTCTCTGCCC 180
CTGAAGGCGG TGGAGGCGCT AGCCTCCGGG GCCTGGCCTC TGCCGGCCTC GCTGTGCCCC 240
GTCTTCGCGG TGGCCCACTT CTTCCCACTC TATGCCGGCG GGGGCTTCCT GGCCGCCCTG 300
AGTGCAGGCC GCTACCTGGG AGCAGCCTTC CCCTTGGGCT ACCAAGCCTT CCGGAGGCCG 360
TGCTATTCCT GGGGGGTGTG CGCGGCCATC TGGGCCCTCG TCCTGTGTCA CCTGGGTCTG 420
GTCTTTGGGT TGGAGGCTCC AGGAGGCTGG CTGGACCACA GCAACACCTC CCTGGGCATC 480
AACACACCGG TCAACGGCTC TCCGGTCTGC CTGGAGGCCT GGGACCCGGC CTCTGCCGGC 540
CCGGCCCGCT TCAGCCTCTC TCTCCTGCTC TTTTTTCTGC CCTTGGCCAT CACAGCCTTC 600
TGCTACGTGG GCTGCCTCCG GGCACTGGCC CGCTCCGGCC TGACGCACAG GCGGAAGCTG 660
CGGGCCAAAT GGGTGGCCGG CGGGGCCCTC CTCACGCTGC TGCTCTGCGT AGGACCCTAC 720
AACGCCTCCA ACGTGGCCAG CTTCCTGTAC CCCAATCTAG GAGGCTCCTG GCGGAAGCTC 780
GGGCTCATCA CGGGTGCCTG GAGTGTGGTG CTTAATCCGC TGGTGACCGG TTACTTGGGA 840
AGGGGTCCTG GCCTGAAGAC AGTGTGTGCG GCAAGAACGC AAGGGGGCAA GTCCCAGAAG 900
TAA 903

(273) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:

Met Asp Leu Pro Pro Gln Leu Ser Phe Gly Leu Tyr Val Ala Ala Phe
1 5 10 15

Ala Leu Gly Phe Pro Leu Asn Val Leu Ala Ile Arg Gly Ala Thr Ala
20 25 30

His Ala Arg Leu Arg Leu Thr Pro Ser Leu Val Tyr Ala Leu Asn Leu
35 40 45

Gly Cys Ser Asp Leu Leu Leu Thr Val Ser Leu Pro Leu Lys Ala Val
50 55 60

Glu Ala Leu Ala Ser Gly Ala Trp Pro Leu Pro Ala Ser Leu Cys Pro
65 70 75 80

Val Phe Ala Val Ala His Phe Phe Pro Leu Tyr Ala Gly Gly Gly Phe
85 90 95

Leu Ala Ala Leu Ser Ala Gly Arg Tyr Leu Gly Ala Ala Phe Pro Leu
100 105 110

Gly Tyr Gln Ala Phe Arg Arg Pro Cys Tyr Ser Trp Gly Val Cys Ala
115 120 125

Ala Ile Trp Ala Leu Val Leu Cys His Leu Gly Leu Val Phe Gly Leu
130 135 140

Glu Ala Pro Gly Gly Trp Leu Asp His Ser Asn Thr Ser Leu Gly Ile
145 150 155 160

Asn Thr Pro Val Asn Gly Ser Pro Val Cys Leu Glu Ala Trp Asp Pro
165 170 175

Ala Ser Ala Gly Pro Ala Arg Phe Ser Leu Ser Leu Leu Leu Phe Phe
180 185 190

Leu Pro Leu Ala Ile Thr Ala Phe Cys Tyr Val Gly Cys Leu Arg Ala
195 200 205

Leu Ala Arg Ser Gly Leu Thr His Arg Arg Lys Leu Arg Ala Lys Trp
210 215 220

Val Ala Gly Gly Ala Leu Leu Thr Leu Leu Leu Cys Val Gly Pro Tyr
225 230 235 240

Asn Ala Ser Asn Val Ala Ser Phe Leu Tyr Pro Asn Leu Gly Gly Ser
245 250 255

Trp Arg Lys Leu Gly Leu Ile Thr Gly Ala Trp Ser Val Val Leu Asn
260 265 270

Pro Leu Val Thr Gly Tyr Leu Gly Arg Gly Pro Gly Leu Lys Thr Val
275 280 285

Cys Ala Ala Arg Thr Gln Gly Gly Lys Ser Gln Lys
290 295 300

(274) INFORMATION FOR SEQ ID NO: 273:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:

ATGGATACAG GCCCCGACCA GTCCTACTTC TCCGGCAATC ACTGGTTCGT CTTCTCGGTG 60

TACCTTCTCA CTTTCCTGGT GGGGCTCCCC CTCAACCTGC TGGCCCTGGT GGTCTTCGTG 120

GGCAAGCTGC AGCGCCGCCC GGTGGCCGTG GACGTGCTCC TGCTCAACCT GACCGCCTCG 180

GACCTGCTCC TGCTGCTGTT CCTGCCTTTC CGCATGGTGG AGGCAGCCAA TGGCATGCAC 240

TGGCCCCTGC CCTTCATCCT CTGCCCACTC TCTGGATTCA TCTTCTTCAC CACCATCTAT 300

CTCACCGCCC TCTTCCTCGC AGCTGTGAGC ATTGAACGCT TCCTGAGTGT GGCCCACCCA 360

CTGTGGTACA AGACCCGGCC GAGGCTGGGG CAGGCAGGTC TGGTGAGTGT GGCCTGCTGG 420

CTGTTGGCCT CTGCTCACTG CAGCGTGGTC TACGTCATAG AATTCTCAGG GGACATCTCC 480

CACAGCCAGG GCACCAATGG GACCTGCTAC CTGGAGTTCC GGAAGGACCA GCTAGCCATC 540

CTCCTGCCCG TGCGGCTGGA GATGGCTGTG GTCCTCTTTG TGGTCCCGCT GATCATCACC 600

AGCTACTGCT ACAGCCGCCT GGTCTGGATC CTCGGCAGAG GGGGCAGCCA CCGCCGGCAG 660

AGGAGGGTGA AGGGGCTGTT GGCGGCCACG CTCCTCAACT TCCTTGTCTG CTTTGGGCCC 720

TACAACGTGT CCCATGTCGT GGGCTATATC TGCGGTGAAA GCCCGGCATG GAGGATCTAC 780

GTGACGCTTC TCAGCACCCT GAACTCCTGT GTCGACCCCT TTGTCTACTA CTTCTCCTCC 840

TCCGGGTTCC AAGCCGACTT TCATGAGCTG CTGAGGAGGT TGTGTGGGCT CTGGGGCCAG 900

TGGCAGCAGG AGAGCAGCAT GGAGCTGAAG GAGCAGAAGG GAGGGGAGGA GCAGAGAGCG 960

GACCGACCAG CTGAAAGAAA GACCAGTGAA CACTCACAGG GCTGTGGAAC TGGTGGCCAG 1020

GTGGCCTGTG CTGAAAGCTA G 1041

(275) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:

Met Asp Thr Gly Pro Asp Gln Ser Tyr Phe Ser Gly Asn His Trp Phe
1 5 10 15

Val Phe Ser Val Tyr Leu Leu Thr Phe Leu Val Gly Leu Pro Leu Asn
20 25 30

Leu Leu Ala Leu Val Val Phe Val Gly Lys Leu Gln Arg Arg Pro Val
35 40 45

Ala Val Asp Val Leu Leu Leu Asn Leu Thr Ala Ser Asp Leu Leu Leu
50 55 60

Leu Leu Phe Leu Pro Phe Arg Met Val Glu Ala Ala Asn Gly Met His
65 70 75 80

Trp Pro Leu Pro Phe Ile Leu Cys Pro Leu Ser Gly Phe Ile Phe Phe
85 90 95

Thr Thr Ile Tyr Leu Thr Ala Leu Phe Leu Ala Ala Val Ser Ile Glu
100 105 110

Arg Phe Leu Ser Val Ala His Pro Leu Trp Tyr Lys Thr Arg Pro Arg
115 120 125

Leu Gly Gln Ala Gly Leu Val Ser Val Ala Cys Trp Leu Leu Ala Ser
130 135 140

Ala His Cys Ser Val Val Tyr Val Ile Glu Phe Ser Gly Asp Ile Ser
145 150 155 160

His Ser Gln Gly Thr Asn Gly Thr Cys Tyr Leu Glu Phe Arg Lys Asp
165 170 175

Gln Leu Ala Ile Leu Leu Pro Val Arg Leu Glu Met Ala Val Val Leu
180 185 190

Phe Val Val Pro Leu Ile Ile Thr Ser Tyr Cys Tyr Ser Arg Leu Val
195 200 205

Trp Ile Leu Gly Arg Gly Gly Ser His Arg Arg Gln Arg Arg Val Lys
210 215 220

Gly Leu Leu Ala Ala Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro
225 230 235 240

Tyr Asn Val Ser His Val Val Gly Tyr Ile Cys Gly Glu Ser Pro Ala
245 250 255

Trp Arg Ile Tyr Val Thr Leu Leu Ser Thr Leu Asn Ser Cys Val Asp
260 265 270

Pro Phe Val Tyr Tyr Phe Ser Ser Ser Gly Phe Gln Ala Asp Phe His
275 280 285

Glu Leu Leu Arg Arg Leu Cys Gly Leu Trp Gly Gln Trp Gln Gln Glu
290 295 300

Ser Ser Met Glu Leu Lys Glu Gln Lys Gly Gly Glu Glu Gln Arg Ala
305 310 315 320

Asp Arg Pro Ala Glu Arg Lys Thr Ser Glu His Ser Gln Gly Cys Gly
325 330 335

Thr Gly Gly Gln Val Ala Cys Ala Glu Ser
340 345

(276) INFORMATION FOR SEQ ID NO:275:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275:

ATGCTGCCGG ACTGGAAGAG CTCCTTGATC CTCATGGCTT ACATCATCAT CTTCCTCACT 60
GGCCTCCCTG CCAACCTCCT GGCCCTGCGG GCCTTTGTGG GGCGGATCCG CCAGCCCCAG 120
CCTGCACCTG TGCACATCCT CCTGCTGAGC CTCACGCTGG CCGACCTCCT CCTGCTGCTG 180
CTGCTGCCCT TCAAGATCAT CGAGGCTGCG TCGAACTTCC GCTGGTACCT GCCCAAGGTC 240
GTCTGCGCCC TCACGAGTTT TGGCTTCTAC AGCAGCATCT ACTGCAGCAC GTGGCTCCTG 300
GCGGGCATCA GCATCGAGCG CTACCTGGGA GTCGCTTTCC CCGTGCAGTA CAAGCTCTCC 360
CGCCGGCCTC TCTATGGAGT GATTGCAGCT CTGGTGGCCT GGGTTATGTC CTTTGGTCAC 420
TGCACCATCG TGATCATCGT TCAATACTTG AACACGACTG AGCAGGTCAG AAGTGGCAAT 480
GAAATTACCT GCTACGAGAA CTTCACCGAT AACCAGTTCG ACGTGGTGCT GCCCGTGCGG 540
CTGGAGCTGT GCCTGGTGCT CTTCTTCATC CCCATGGCAG TCACCATCTT CTGCTACTGG 600
CGTTTTGTGT GGATCATGCT CTCCCAGCCC CTTGTGGGGG CCCAGAGGCG GCGCCGAGCC 660
AAGGGGCTGG CTGTGGTGAC GCTGCTCAAT TTCCTGGTGT GCTTCGGACC TTACAACGTG 720
TCCCACCTGG TGGGGTATCA CCAGAGAAAA AGCCCCTGGT GGCGGTCAAT AGCCGTGGTG 780
TTCAGTTCAC TCAACGCCAG TCTGGACCCC CTGCTCTTCT ATTTCTCTTC TTCAGTGGTG 840
CGCAGGGCAT TTGGGAGAGG GCTGCAGGTG CTGCGGAATC AGGGCTCCTC CCTGTTGGGA 900
CGCAGAGGCA AAGACACAGC AGAGGGGACA AATGAGGACA GGGGTGTGGG TCAAGGAGAA 960
GGGATGCCAA GTTCGGACTT CACTACAGAG TAG 993
(277) INFORMATION FOR SEQ ID NO: 276:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:

Met Leu Pro Asp Trp Lys Ser Ser Leu Ile Leu Met Ala Tyr Ile Ile
1 5 10 15

Ile Phe Leu Thr Gly Leu Pro Ala Asn Leu Leu Ala Leu Arg Ala Phe
20 25 30

Val Gly Arg Ile Arg Gln Pro Gln Pro Ala Pro Val His Ile Leu Leu
35 40 45

Leu Ser Leu Thr Leu Ala Asp Leu Leu Leu Leu Leu Leu Leu Pro Phe
50 55 60

Lys Ile Ile Glu Ala Ala Ser Asn Phe Arg Trp Tyr Leu Pro Lys Val
65 70 75 80

Val Cys Ala Leu Thr Ser Phe Gly Phe Tyr Ser Ser Ile Tyr Cys Ser
85 90 95

Thr Trp Leu Leu Ala Gly Ile Ser Ile Glu Arg Tyr Leu Gly Val Ala
100 105 110

Phe Pro Val Gln Tyr Lys Leu Ser Arg Arg Pro Leu Tyr Gly Val Ile
115 120 125

Ala Ala Leu Val Ala Trp Val Met Ser Phe Gly His Cys Thr Ile Val
130 135 140

Ile Ile Val Gln Tyr Leu Asn Thr Thr Glu Gln Val Arg Ser Gly Asn
145 150 155 160

Glu Ile Thr Cys Tyr Glu Asn Phe Thr Asp Asn Gln Leu Asp Val Val
165 170 175

Leu Pro Val Arg Leu Glu Leu Cys Leu Val Leu Phe Phe Ile Pro Met
180 185 190

Ala Val Thr Ile Phe Cys Tyr Trp Arg Phe Val Trp Ile Met Leu Ser
195 200 205

Gln Pro Leu Val Gly Ala Gln Arg Arg Arg Arg Ala Lys Gly Leu Ala
210 215 220

Val Val Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro Tyr Asn Val
225 230 235 240

Ser His Leu Val Gly Tyr His Gln Arg Lys Ser Pro Trp Trp Arg Ser
245 250 255

Ile Ala Val Val Phe Ser Ser Leu Asn Ala Ser Leu Asp Pro Leu Leu
260 265 270

Phe Tyr Phe Ser Ser Ser Val Val Arg Arg Ala Phe Gly Arg Gly Leu
275 280 285

Gln Val Leu Arg Asn Gln Gly Ser Ser Leu Leu Gly Arg Arg Gly Lys
290 295 300

Asp Thr Ala Glu Gly Thr Asn Glu Asp Arg Gly Val Gly Gln Gly Glu
305 310 315 320

Gly Met Pro Ser Ser Asp Phe Thr Thr Glu
325 330
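
As a quick consistency check on the two listings above, the final 33 bases of SEQ ID NO: 275 can be translated with the standard genetic code and compared with the last ten residues of SEQ ID NO: 276. The following is a minimal Python sketch of that check; the function name `translate` and the compact way of building the codon table are illustrative choices and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code. Codons are enumerated TTT, TTC, TTA, TTG, TCT, ...
# (bases taken in the order T, C, A, G, last position varying fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding strand (5' to 3') into one-letter residues; '*' is a stop."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Bases 961-993 of SEQ ID NO: 275, copied from the listing.
tail = "GGGATGCCAA GTTCGGACTT CACTACAGAG TAG"
print(translate(tail))
```

The sketch prints `GMPSSDFTTE*`, that is, Gly Met Pro Ser Ser Asp Phe Thr Thr Glu followed by a stop codon, which agrees with residues 321 to 330 of SEQ ID NO: 276.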

(278) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:
ATGGACACCT CCCGGCTCGG TGTGCTCCTG T^CTTGCCT^ TGCTGCTGCA GC^I^GCGACC 60 
GGGGGCAGCT CT^CCAGGTC TOOTCrOTTO CTGAGGGGCT GCCCCACACA CTOTCATTGC 120 

GAGCCCGACG GCAGGATGTT GCTCAGGGT^ GACTGCTCCG ACCTGGGGCT CTCGGAGCT^ .80 
CCTTCCAACC TCAGCGTCTX CACCTCCTAC CTAGACC«:a GTATGAACAA CATCAGl^G 240 

35 ^^C'CCCX^ ATCCCCTGCC CAGTCTCCGC rrCCTGGAGG AGTTACGTCT TGCGGGAAAC 300 
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GCTCTGACAT ACATTCCCAA GGGAGCATTC ACTGGCCTTT ACAGTCTTAA AGTTCTTATG 360 

CTGCAGAATA ATCAGCTAAG ACACGTACCC ACAGAAGCTC TGCAGAATTT GCGAAGCCTT 420 

CAATCCCTGC GTCTGGATGC TAACCACATC AGCTATGTGC CCCCAAGCTG TTTCAGTGGC 4G0 

CTGCATTCCC TGAGGCACCT GTGGCTGGAT GACAATGCGT TAACAGAAAT CCCCGTCCAG 540 

5 GCTTTTAGAA GTTTATCGGC ATTGCAAGCC ATGACCTTGG CCCTGAACAA AATACACCAC 600 

ATACCAGACT ATGCCTTTGG AAACCTCTCC AGCTTGGTAG TTCTACATCT CCATAACAAT 660 

AGAATCCACT CCCTGGGAAA GAAATGCTTT GATGGGCTCC ACAGCCTAGA GACTTTAGAT 720 

TTAAATTACA ATAACCTTGA TGAATTCCCC ACTGCAATTA GGACACTCTC CAACCTTAAA 780 

GAACTAGGAT TTCATAGCAA CAATATCAGG TCGATACCTG AGAAAGCATT TGTAGGCAAC 840 

10 CCTTCTCTTA TTACAATACA TTTCTAT6AC AATCCCATCC AATTTGTTGG GAGATCTGCT 900 

TTTCAACATT TACCTGAACT AAGAACACTG ACTCTGAATG GTGCCTCACA AATAACTGAA 960 

TTTCCTGATT TAACTGGAAC TGCAAACCTG GAGAGTCTGA CTTTAACTGG AGCACAGATC 1020 

TCATCTCTTC CTCAAACCGT CTGCAATCAG TTACCTAATC TCCAAGTGCT AGATCTGTCT 1080 

TACAACCTAT TAGAAGATTT ACCCAGTTTT TCAGTCTGCC AAAAGCTTCA GAAAATTGAC 1140 

15 CTAAGACATA ATGAAATCTA CGAAATTAAA GTTGACACTT TCCAGCAGTT GCTTAGCCTC 1200 

CGATCGCTGA ATTTGGCTTG GAACAAAATT GCTATTATTC ACCCCAATGC ATTTTCCACT 1260 

TTGCCATCCC TAATAAAGCT GGACCTATCG TCCAACCTCC TGTCGTCTTT TCCTATAACT 1320 

G6GTTACATG GTTTAACTCA CTTAAAATTA ACAGGAAATC ATGCCTTACA GAGCTTGATA 1380 

TCATCTGAAA ACTTTCCAGA ACTCAAGGTT ATAGAAATGC CTTATGCTTA CCAGTGCTGT 1440 

20 GCATTTGGAG TGTGTGAGAA TGCCTATAAG ATTTCTAATC AATGGAATAA AGGTGACAAC 1500 

AGCAGTATGG ACGACCTTCA TAAGAAAGAT GCTGGAATGT TTCAGGCTCA AGATGAACGT 1560 

GACCTTGAAG ATTTCCTGCT TGACTTTGAG GAAGACCTGA AAGCCCTTCA TTCAGTGCAG 1620 

TGTTCACCTT CCCCAGGCCC CTTCAAACCC TGTGAACACC TGCTTGATGG CTGGCTGATC 1680 

AGAATTGGAG TGTGGACCAT AGCAGTTCTG GCACTTACTT GTAATGCTTT GGTGACTTCA 1740 

25 ACAGTTTTCA GATCCCCTCT GTACATTTCC CCCATTAAAC TGTTAATTGG GGTCATCGCA 1800 

GCAGTGAACA TGCTCACGGG AGTCTCCAGT GCCGTGCTGG CTGGTGTGGA TGCGTTCACT 1860 

TTTGGCAGCT TTGCACGACA TGGTGCCTGG TGGGAGAATG GGGTTGGTTG CCATGTCATT 1920 

GGTTTTTTGT CCATTTTTGC TTCAGAATCA TCTGTTTTCC TGCTTACTCT GGCAGCCCTG 1980 
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GAGCGTCGGT TCTCTGTGAA ATATTCTGCA AAATTTGAAA CGAAAGCTCC ATTTTCTAGC 2040 
CTCAAAGTAA TCATTTTGCT CTGTGCCCTG CTGGCCTTQA CCATGGCCGC AGITCCCCTC 2100 
CTCGGTGGCA GCAAOTATGG CGCCTCCCCT CTCTGCCTGC CTTTGCCITT •TOGGGAGCCC 2160 
AGCACCATGG GCTACATGGT CGCTCTCATC TTGCTCAATT CCCTTTGCrr CCTCATGATG 2220 
ACCATTOCCT ACACCAAGCT CTACTGCAAT TTGGACAAGG GAGACCTGGA GAATATTTGG 2280 
GACTGCTCTA TGAAAAAACA CATTGCCCTG TTCCTCTTCA CCAACTGCAT CCTAAACTGC 2340 
CCTGTGGCTT TClTGTCCrr CTCCTCTTTA ATAAACCTTA CArrTATCAG' TCCTOAAGTA 2400 
ATTAAGTTTA TCCTTCTGGT GGTAGTCCCA CTTCCTGCAT GTCTCAATCC CCTTCTCTAC 2460 
ATCTTGTTCA ATCCTCACTT TAAGGAGGAT Cr^GTGAGCC TGAGAAAGCA AACCTACGTC 2520 
TGGACAAGAT CAAAACACCC AAGCTTOATG TCAATTAACT CTGATGATGT CGAAAAACAG 2580 
TCCTGTCACT CAACTCAAGC CTTGGTAACC TTTACC7.GCT CCAGCATCAC TTATGACCTO 2640 
CCTCCCAGrr CCGTGCCATC ACCAGCTTAT CCAGTGACTG AGAGCTGCCA TCTTTCCTCT 2700 
GTGGCATTTG TCCCATGTCT CTAA 

2724 

(279) INFORMATION FOR SEQ ID NO: 278: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 907 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:

Met ASP Thr Ser Arg Leu Gly Val Leu Leu Ser Leu Pro Val Leu Leu 
^ - 10 15 

Gin Leu Ala Thr Gly Gly Ser . Ser Pro Arg Ser Gly Val Leu Leu Arg 
20 25 30 

Gly Cys Pro Thr His Cys His Cys Glu Pro Asp Gly Arg Met Leu Leu 
35 40 45 

Arg Val Asp Cys Ser Asp Leu Gly Leu Ser Glu Leu Pro Ser Asn Leu 

60 

ser val Phe Thr Ser Tyr Leu Asp Leu Ser Met Asn Asn lie Ser Gin 

"^^ 80 
Leu Leu Pro Asn Pro Leu Pro Ser Leu Arg Phe Leu Glu Glu Leu Arg 



85 90 35 
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Leu Ala Gly Asn Ala Leu Thr Tyr lie Pro Lys Gly Ala Phe Thr Gly 
100 105 110 

Leu Tyr Ser Leu Lys Val Leu Met Leu Gin Asn Asn Gin Leu Arg His 
115 120 125 

Val Pro Thr Glu Ala Leu Gin Asn Leu Arg Ser Leu Gin Ser Leu Arg 
130 135 140 

Leu Asp Ala Asn His lie Ser Tyr Val Pro Pro Ser Cys Phe Ser Gly 
145 150 155 160 

Leu His Ser Leu Arg His Leu Trp Leu Asp Asp Asn Ala Leu Thr Glu 
165 170 175 

lie Pro Val Gin Ala Phe Arg Ser Leu Ser Ala Leu Gin Ala Met Thr 
180 185 190 

Leu Ala Leu Asn Lys lie His His lie Pro Asp Tyr Ala Phe Gly Asn 
195 200 205 

Leu Ser Ser Leu Val Val Leu His Leu His Asn Asn Arg lie His Ser 
210 215 220 

Leu Gly Lys Lys Cys Phe Asp Gly Leu His Ser Leu Glu Thr Leu Asp 
225 230 235 240 

Leu Asn Tyr Asn Asn Leu Asp Glu Phe Pro Thr Ala lie Arg Thr Leu 
245 250 255 

Ser Asn Leu Lys Glu Leu Gly Phe His Ser Asn Asn lie Arg Ser He 
260 265 270 

Pro Glu Lys Ala Phe Val Gly Asn Pro Ser Leu He Thr He His Phe 
275 280 285 

Tyr Asp Asn Pro He Gin Phe Val Gly Arg Ser Ala Phe Gin His Leu 
290 295 300 

Pro Glu Leu Arg Thr Leu Thr Leu Asn Gly Ala Ser Gin He Thr Glu 
305 310 315 320 

Phe Pro Asp Leu Thr Gly Thr Ala Asn Leu Glu Ser Leu Thr Leu Thr 
325 330 335 

Gly Ala Gin He Ser Ser Leu Pro Gin Thr Val Cys Asn Gin Leu Pro 
340 345 350 

Asn Leu Gin Val Leu Asp Leu Ser Tyr Asn Leu Leu Glu Asp Leu Pro 
355 360 365 

Ser Phe Ser Val Cys Gin Lys Leu Gin Lys He Asp Leu Arg His Asn 
370 375 380 



Glu Ile Tyr Glu Ile Lys Val Asp Thr Phe Gln Gln Leu Leu Ser Leu
385 390 395 400



Arg Ser Leu Asn Leu Ala Trp Asn Lys He Ala He He His Pro Asn 

410 

Ala Phe ser Thr Leu Pro Ser Leu He Lys Leu Asp Leu Ser Ser Asn 

430 

Leu Leu Ser Ser Phe Pro He Thr Gly Leu His Gly Leu Thr His Leu 
435 

Lys Leu Thr Gly Akn His Ala Leu Gin Ser Leu lie Ser Ser Glu Asn 



Phe Pro Glu Leu Lys Val He Glu Met Pro Tyr Ala Tyr Gin- Cys Cys 

'^"^^ 480 

Ala Phe Gly val Cys Glu Asn Ala Tyr Lys He Ser Asn Gin Trp Asn 
485 490 

Lys Gly Asp Asn Ser Ser Met Asp Asp Leu His. Lys Lys Asp Ala Gly 
500 505 510 

Met Phe Gin Ala Gin Asp Glu Arg Asp Leu Glu Asp Phe Leu Leu Asp 

520 

Phe Glu Glu ASP Leu Lys Ala Leu His Ser Val Gin Cys Ser Pro Ser 

535 540 



Pro Gly Pro Phe Lys Pro Cys Glu His Leu Leu Asp Gly Trp Leu He 

"° 555 560 

Arg He Gly Val Trp Thr He Ala Val Leu Ala Leu Thr Cys Asn Ala 
565 570 

Leu val Thr Ser Thr Val Phe Arg Ser Pro Leu Tyr He Ser Pro He 

585 590 

Lys Leu Leu He Gly Val He Ala Ala Val Asn Met Leu Thr Gly Val 

600 605 

ser ser Ala Val Leu Ala Gly Val Asp Ala Phe Thr Phe Gly Ser Phe 

620 

Ala Arg His Gly Ala Trp Trp Glu Asn Gly Val Gly Cys His Val He 



635 



640 



Gly Phe Leu Ser He Phe Ala Ser Glu Ser Ser Val Phe Leu Leu Thr 

650 655 
Leu Ala Ala Leu Glu Arg Gly Phe Ser Val Lys Tyr Ser Ala Lys Phe 



665 



670 

Glu Thr Lys Ala Pro Phe Ser Ser Leu Lys Val He He Leu Leu Cys 



685 




Ala Leu Leu Ala Leu Thr Met Ala Ala Val Pro Leu Leu Gly Gly Ser 
690 695 700 

Lys Tyr Gly Ala Ser Pro Leu Cys Leu Pro Leu Pro Phe Gly Glu Pro 
705 710 715 720 

5 Ser Thr Met Gly Tyr Met Val Ala Leu lie Leu Leu Asn Ser Leu Cys 

725 730 735 

Phe Leu Met Met Thr lie Ala Tyr Thr Lys Leu Tyr Cys Asn Leu Asp 
740 745 750 

Lys Gly Asp Leu Glu Asn He Trp Asp Cys Ser Met Lys Lys His He 
10 755 760 765 

Ala Leu Leu Leu Phe Thr Asn Cys He Leu Asn Cys Pro Val Ala Phe 
770 ' 775 780 

Leu Ser Phe Ser Ser Leu He Asn Leu Thr Phe He Ser Pro Glu Val 
785 790 795 800 

15 He Lys Phe He Leu Leu Val Val Val Pro Leu Pro Ala Cys Leu Asn 

805 810 815 

Pro Leu Leu Tyr He Leu Phe Asn Pro His Phe Lys Glu Asp Leu Val 
820 825 830 

Ser Leu Arg Lys Gin Thr Tyr Val Trp Thr Arg Ser Lys His Pro Ser 
20 835 840 845 

Leu Met Ser He Asn Ser Asp Asp Val Glu Lys Gin Ser Cys Asp Ser 
850 855 860 



25 



Thr Gin Ala Leu Val Thr Phe Thr Ser Ser Ser He Thr Tyr Asp Leu 
865 870 875 880 

Pro Pro Ser Ser Val Pro Ser Pro Ala Tyr Pro Val Thr Glu Ser Cys 
885 890 895 

His Leu Ser Ser Val Ala Phe Val Pro Cys Leu 
900 905 
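
The lengths stated in the listing above agree with one another under a simple assumption: each nucleotide listing is a complete open reading frame with one codon per residue plus a single stop codon (SEQ ID NO: 275 ends in TAG and SEQ ID NO: 277 ends in TAA). A minimal Python sketch of that check follows; the helper name `is_consistent` is illustrative only.

```python
# (DNA SEQ ID NO, base pairs, protein SEQ ID NO, amino acids), as stated in the listing.
PAIRS = [
    (275, 993, 276, 330),
    (277, 2724, 278, 907),
]


def is_consistent(base_pairs: int, amino_acids: int) -> bool:
    """True if the DNA length equals one codon per residue plus one stop codon."""
    return base_pairs == 3 * (amino_acids + 1)


for dna_id, bp, protein_id, aa in PAIRS:
    status = "consistent" if is_consistent(bp, aa) else "inconsistent"
    print(f"SEQ ID NO: {dna_id} ({bp} bp) vs SEQ ID NO: {protein_id} ({aa} aa): {status}")
```

Both pairs satisfy the relation, since 3 x (330 + 1) = 993 and 3 x (907 + 1) = 2724.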

(280) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:

CATGCCAACC GGCCCGCGAG GCTGCTGCTG GT 32
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(281) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:

ACCAGCAGCA GCCTCGCGGG CCGGTTGGCA TG 32
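
As written, the 32-base sequences of SEQ ID NO: 279 and SEQ ID NO: 280 are reverse complements of one another. The short Python sketch below verifies this from the listed bases only; the helper name `reverse_complement` is an illustrative choice, and nothing here is meant to state how the sequences were used.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


SEQ_279 = "CATGCCAACC GGCCCGCGAG GCTGCTGCTG GT"
SEQ_280 = "ACCAGCAGCA GCCTCGCGGG CCGGTTGGCA TG"

print(reverse_complement(SEQ_279) == SEQ_280.replace(" ", ""))
```

The comparison prints `True`.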
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NON-ENDOGENOUS, CONSTITUnVELY ACTIVATED 
HUMAN G PROTEIN-COUPLED RECEPTORS

The benefits of commonly owned U.S. Serial Number 09/170,496, filed
October 13, 1998; U.S. Serial Number 08/839,449, filed April 14, 1997 (now abandoned);
U.S. Serial Number 09/060,188, filed April 14, 1998; U.S. Provisional Number 60/090,783,
filed June 26, 1998; and U.S. Provisional Number 60/095,677, filed on August 7, 1998, are
hereby claimed. Each of the foregoing applications is incorporated by reference herein in
its entirety.



FIELD OF THE INVENTION
The invention disclosed in this patent document relates to transmembrane
receptors, and more particularly to human G protein-coupled receptors (GPCRs) which have
been altered such that the altered GPCRs are constitutively activated. Most preferably, the
altered human GPCRs are used for the screening of therapeutic compounds.

BACKGROUND OF THE INVENTION
Although a number of receptor classes exist in humans, by far the most abundant and
therapeutically relevant is represented by the G protein-coupled receptor (GPCR or GPCRs) class.
It is estimated that there are some 100,000 genes within the human genome, and of these,
approximately 2%, or 2,000 genes, are estimated to code for GPCRs. Of these, there are
approximately 100 GPCRs for which the endogenous ligand that binds to the GPCR has been
identified. Because of the significant time-lag that exists between the discovery of an endogenous
GPCR and its endogenous ligand, it can be presumed that the remaining 1,900 GPCRs will be
identified and characterized long before the endogenous ligands for these receptors are identified.
Indeed, the rapidity by which the Human Genome Project is sequencing the 100,000 human genes
indicates that the remaining human GPCRs will be fully sequenced within the next few years.
Nevertheless, and despite the efforts to sequence the human genome, it is still very unclear as to
how scientists will be able to rapidly, effectively and efficiently exploit this information to
improve and enhance the human condition. The present invention is geared towards this
important objective.

Receptors, including GPCRs, for which the endogenous ligand has been identified are
referred to as "known" receptors, while receptors for which the endogenous ligand has not been
identified are referred to as "orphan" receptors. This distinction is not merely semantic,
particularly in the case of GPCRs. GPCRs represent an important area for the development of
pharmaceutical products: from approximately 20 of the 100 known GPCRs, 60% of all
prescription pharmaceuticals have been developed. Thus, the orphan GPCRs are to the
pharmaceutical industry what gold was to California in the late 19th century - an opportunity to
drive growth, expansion, enhancement and development. A serious drawback exists, however,
with orphan receptors relative to the discovery of novel therapeutics. This is because the
traditional approach to the discovery and development of pharmaceuticals has required access to
both the receptor and its endogenous ligand. Thus, heretofore, orphan GPCRs have presented the
art with a tantalizing and undeveloped resource for the discovery of pharmaceuticals.

Under the traditional approach to the discovery of potential therapeutics, it is generally the
case that the receptor is first identified. Before drug discovery efforts can be initiated, elaborate,
time consuming and expensive procedures are typically put into place in order to identify
and generate the receptor's endogenous ligand - this process can require from between 3 and ten
years per receptor, at a cost of about $5 million (U.S.) per receptor. These time and financial
resources must be expended before the traditional approach to drug discovery can commence.
This is because traditional drug discovery techniques rely upon so-called "competitive binding
assays" whereby putative therapeutic agents are "screened" against the receptor in an effort to
discover compounds that either block the endogenous ligand from binding to the receptor
("antagonists"), or enhance or mimic the effects of the ligand binding to the receptor ("agonists").
The overall objective is to identify compounds that prevent cellular activation when the ligand
binds to the receptor (the antagonists), or that enhance or increase cellular activity that would
otherwise occur if the ligand was properly binding with the receptor (the agonists). Because the
endogenous ligands for orphan GPCRs are by definition not identified, the ability to discover novel
and unique therapeutics to these receptors using traditional drug discovery techniques is not
possible. The present invention, as will be set forth in greater detail below, overcomes these and
other severe limitations created by such traditional drug discovery techniques.

GPCRs share a common structural motif. All these receptors have seven sequences of
between 22 to 24 hydrophobic amino acids that form seven alpha helices, each of which spans the
membrane (each span is identified by number, i.e., transmembrane-1 (TM-1), transmembrane-2
(TM-2), etc.). The transmembrane helices are joined by strands of amino acids between
transmembrane-2 and transmembrane-3, transmembrane-4 and transmembrane-5, and
transmembrane-6 and transmembrane-7 on the exterior, or "extracellular" side, of the cell
membrane (these are referred to as "extracellular" regions 1, 2 and 3 (EC-1, EC-2 and EC-3),
respectively). The transmembrane helices are also joined by strands of amino acids between
transmembrane-1 and transmembrane-2, transmembrane-3 and transmembrane-4, and
transmembrane-5 and transmembrane-6 on the interior, or "intracellular" side, of the cell
membrane (these are referred to as "intracellular" regions 1, 2 and 3 (IC-1, IC-2 and IC-3),
respectively). The "carboxy" ("C") terminus of the receptor lies in the intracellular space within
the cell, and the "amino" ("N") terminus of the receptor lies in the extracellular space outside of
the cell. The general structure of G protein-coupled receptors is depicted in Figure 1.
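
The loop nomenclature just described follows a regular alternation, which can be summarized in a few lines of Python. This is only an illustrative sketch of the naming scheme in the paragraph above; the function name `loop_between` is hypothetical.

```python
def loop_between(tm_a: int, tm_b: int) -> str:
    """Name the loop that joins two consecutive transmembrane helices.

    Loops after odd-numbered helices (TM-1, TM-3, TM-5) lie on the intracellular
    side; loops after even-numbered helices (TM-2, TM-4, TM-6) are extracellular.
    """
    if not (1 <= tm_a <= 6 and tm_b == tm_a + 1):
        raise ValueError("expected consecutive helices between TM-1 and TM-7")
    side = "IC" if tm_a % 2 == 1 else "EC"
    index = (tm_a + 1) // 2
    return f"{side}-{index}"


for tm in range(1, 7):
    print(f"TM-{tm} / TM-{tm + 1}: {loop_between(tm, tm + 1)}")
```

Under this scheme the loop between TM-5 and TM-6 is IC-3, the region that, together with TM6, is the focus of the mutations discussed later in this document.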

Generally, when an endogenous ligand binds with the receptor (often referred to as
"activation" of the receptor), there is a change in the conformation of the intracellular region that
allows for coupling between the intracellular region and an intracellular "G-protein." Although
other G proteins exist, currently, Gq, Gs, Gi, and Go are G proteins that have been identified.
Endogenous ligand-activated GPCR coupling with the G-protein begins a signaling cascade
process (referred to as "signal transduction"). Under normal conditions, signal transduction
ultimately results in cellular activation or cellular inhibition. It is thought that the IC-3 loop as
well as the carboxy terminus of the receptor interact with the G protein. A principal focus of this
invention is directed to the transmembrane-6 (TM6) region and the intracellular-3 (IC3) region of
the GPCR.

Under physiological conditions, GPCRs exist in the cell membrane in equilibrium between
two different conformations: an "inactive" state and an "active" state. As shown schematically in
Figure 2, a receptor in an inactive state is unable to link to the intracellular signaling transduction
pathway to produce a biological response. Changing the receptor conformation to the active state
allows linkage to the transduction pathway (via the G-protein) and produces a biological response.

A receptor may be stabilized in an active state by an endogenous ligand or a compound
such as a drug. Recent discoveries, including but not exclusively limited to modifications to the
amino acid sequence of the receptor, provide means other than endogenous ligands or drugs to
promote and stabilize the receptor in the active state conformation. These means effectively
stabilize the receptor in an active state by simulating the effect of an endogenous ligand binding
to the receptor. Stabilization by such ligand-independent means is termed "constitutive receptor
activation."



As noted above, the use of an orphan receptor for screening purposes has not been
possible. This is because the traditional "dogma" regarding screening of compounds mandates that
the ligand for the receptor be known. By definition, then, this approach has no applicability with
respect to orphan receptors. Thus, by adhering to this dogmatic approach to the discovery of
therapeutics, the art, in essence, has taught and has been taught to forsake the use of orphan
receptors unless and until the endogenous ligand for the receptor is discovered. Given that there
are an estimated 2,000 G protein coupled receptors, the majority of which are orphan receptors,
such dogma castigates a creative, unique and distinct approach to the discovery of therapeutics.

Information regarding the nucleic acid and/or amino acid sequences of a variety of GPCRs
is summarized below in Table A. Because an important focus of the invention disclosed herein
is directed towards orphan GPCRs, many of the below-cited references are related to orphan
GPCRs. However, this list is not intended to imply, nor is this list to be construed, legally or
otherwise, that the invention disclosed herein is only applicable to orphan GPCRs or the specific
GPCRs listed below. Additionally, certain receptors that have been isolated are not the subject of
publications per se; for example, reference is made to a G Protein-Coupled Receptor database on
the "world-wide web" (neither the named inventors nor the assignee have any affiliation with this
site) that lists GPCRs. Other GPCRs are the subject of patent applications owned by the present
assignee and these are not listed below (including GPR3, GPR6 and GPR12; see U.S. Provisional
Number 60/094879):



Table A

Receptor Name    Publication Reference
GPR1             23 Genomics 609 (1994)
GPR4             14 DNA and Cell Biology 25 (1995)
GPR5             14 DNA and Cell Biology 25 (1995)
GPR7             28 Genomics 84 (1995)
GPR8             28 Genomics 84 (1995)
GPR9             184 J. Exp. Med. 963 (1996)
GPR10            29 Genomics 335 (1995)
GPR15            32 Genomics 462 (1996)
GPR17            70 J. Neurochem. 1357 (1998)
GPR18            42 Genomics 462 (1997)
GPR20            187 Gene 75 (1997)
GPR21            187 Gene 75 (1997)
GPR22            187 Gene 75 (1997)
GPR24            398 FEBS Lett. 253 (1996)
GPR30            45 Genomics 607 (1997)
GPR31            42 Genomics 519 (1997)
GPR32            50 Genomics 281 (1997)
GPR40            239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR41            239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR43            239 Biochem. Biophys. Res. Commun. 543 (1997)
APJ              136 Gene 355 (1993)
BLR1             22 Eur. J. Immunol. 2759 (1992)
CEPR             231 Biochem. Biophys. Res. Commun. 651 (1997)
EBI1             23 Genomics 643 (1994)
EBI2             67 J. Virol. 2209 (1993)
ETBR-LP2         424 FEBS Lett. 193 (1998)
GPCR-CNS         54 Brain Res. Mol. Brain Res. 152 (1998);
                 45 Genomics 68 (1997)
GPR-NGA          394 FEBS Lett. 325 (1996)
H9               386 FEBS Lett. 219 (1996)
HBA954           1261 Biochim. Biophys. Acta 121 (1995)
HG38             247 Biochem. Biophys. Res. Commun. 266 (1998)
HM74             5 Int. Immunol. 1239 (1993)
                 35 Genomics 397 (1996)
V28              163 Gene 295 (1995)



As will be set forth and disclosed in greater detail below, utilization of a mutational cassette to
modify the endogenous sequence of a human GPCR leads to a constitutively activated version of
the human GPCR. These non-endogenous, constitutively activated versions of human GPCRs can
be utilized, inter alia, for the screening of candidate compounds to directly identify compounds
of, e.g., therapeutic relevance.



SUMMARY OF THE INVENTION

Disclosed herein is a non-endogenous, human G protein-coupled receptor comprising
(a) as a most preferred amino acid sequence region (C-terminus to N-terminus orientation)
and/or (b) as a most preferred nucleic acid sequence region (3' to 5' orientation) transversing
the transmembrane-6 (TM6) and intracellular loop-3 (IC3) regions of the GPCR:

(a) P1 AA15 X

wherein:

(1) P1 is an amino acid residue located within the TM6 region of
the GPCR, where P1 is selected from the group consisting of (i)
the endogenous GPCR's proline residue, and (ii) a non-endogenous
amino acid residue other than proline;

(2) AA15 are 15 amino acids selected from the group consisting of
(a) the endogenous GPCR's amino acids, (b) non-endogenous
amino acid residues, and (c) a combination of the endogenous
GPCR's amino acids and non-endogenous amino acids,
excepting that none of the 15 endogenous amino acid residues
that are positioned within the TM6 region of the GPCR is
proline; and

(3) X is a non-endogenous amino acid residue located within the
IC3 region of said GPCR, preferably selected from the group
consisting of lysine, histidine and arginine, and most
preferably lysine, excepting that when the endogenous amino
acid at position X is lysine, then X is an amino acid other than
lysine, preferably alanine;

and/or

(b) Pcodon (AA-codon)15 Xcodon

wherein:

(1) Pcodon is a nucleic acid sequence within the TM6 region of the
GPCR, where Pcodon encodes an amino acid selected from the
group consisting of (i) the endogenous GPCR's proline residue,
and (ii) a non-endogenous amino acid residue other than proline;

(2) (AA-codon)15 are 15 codons encoding 15 amino acids selected
from the group consisting of (a) the endogenous GPCR's amino
acids, (b) non-endogenous amino acid residues, and (c) a
combination of the endogenous GPCR's amino acids and
non-endogenous amino acids, excepting that none of the 15
endogenous codons within the TM6 region of the GPCR encodes
a proline amino acid residue; and

(3) Xcodon is a nucleic acid encoding region residue located within the
IC3 region of said GPCR, where Xcodon encodes a non-endogenous
amino acid, preferably selected from the group consisting of
lysine, histidine and arginine, and most preferably lysine,
excepting that when the endogenous encoding region at position
Xcodon encodes the amino acid lysine, then Xcodon encodes an amino
acid other than lysine, preferably alanine.

The terms endogenous and non-endogenous in reference to these sequence cassettes are relative
to the endogenous GPCR. For example, once the endogenous proline residue is located within the
TM6 region of a particular GPCR, and the 16th amino acid therefrom is identified for mutation to
constitutively activate the receptor, it is also possible to mutate the endogenous proline residue
(i.e., once the marker is located and the 16th amino acid to be mutated is identified, one may mutate
the marker itself), although it is most preferred that the proline residue not be mutated. Similarly,
and while it is most preferred that AA15 be maintained in their endogenous forms, these amino
acids may also be mutated. The only amino acid that must be mutated in the non-endogenous
version of the human GPCR is X, i.e., the endogenous amino acid that is 16 residues from P1
cannot be maintained in its endogenous form and must be mutated, as further disclosed herein.
Stated again, while it is preferred that in the non-endogenous version of the human GPCR, P1 and
AA15 remain in their endogenous forms (i.e., identical to their wild-type forms), once X is
identified and mutated, any and/or all of P1 and AA15 can be mutated. This applies to the nucleic
acid sequences as well. In those cases where the endogenous amino acid at position X is lysine,
then in the non-endogenous version of such GPCR, X is an amino acid other than lysine,
preferably alanine.

Accordingly, and as a hypothetical example, if the endogenous GPCR has the following
endogenous amino acid sequence at the above-noted positions:

P-AACCTTGGRRRDDDE -Q

then any of the following exemplary and hypothetical cassettes would fall within the scope of
the disclosure (non-endogenous amino acids are set forth in bold):

P-AACCTTGGRRRDDDE -K
P-AACCnmGRRDDDE-K
P-ADEETTGGRRRDDDE -A
P-LLKFMSTWZLVAAPQ -K
A-LLKFMSTWZLVAAPQ -K

It is also possible to add amino acid residues within AA15, but such an approach is not particularly
advanced. Indeed, in the most preferred embodiments, the only amino acid that differs in the
non-endogenous version of the human GPCR as compared with the endogenous version of that GPCR
is the amino acid in position X; mutation of this amino acid itself leads to constitutive activation
of the receptor.
of the receptor. 

Thus, in particularly preferred embodiments, P1 and Pcodon are endogenous proline and an
endogenous nucleic acid encoding region encoding proline, respectively; and X and Xcodon are
non-endogenous lysine or alanine and a non-endogenous nucleic acid encoding region encoding lysine
or alanine, respectively, with lysine being most preferred. Because it is most preferred that the
non-endogenous versions of the human GPCRs which incorporate these mutations are
incorporated into mammalian cells and utilized for the screening of candidate compounds, the
non-endogenous human GPCR incorporating the mutation need not be purified and isolated per se (i.e.,
these are incorporated within the cellular membrane of a mammalian cell), although such purified
and isolated non-endogenous human GPCRs are well within the purview of this disclosure.
Gene-targeted and transgenic non-human mammals (preferably rats and mice) incorporating the
non-endogenous human GPCRs are also within the purview of this invention; in particular,
gene-targeted mammals are most preferred in that these animals will incorporate the non-endogenous
versions of the human GPCRs in place of the non-human mammal's endogenous GPCR-encoding
region (techniques for generating such non-human mammals to replace the non-human mammal's
protein encoding region with a human encoding region are well known; see, for example, U.S.
Patent No. 5,777,194).

It has been discovered that these changes to an endogenous human GPCR render the
GPCR constitutively active such that, as will be further disclosed herein, the non-endogenous,
constitutively activated version of the human GPCR can be utilized for, inter alia, the direct
screening of candidate compounds without the need for the endogenous ligand. Thus, methods
for using these materials, and products identified by these methods, are also within the purview of
the following disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 shows a generalized structure of a G protein-coupled receptor with the numbers
assigned to the transmembrane helices, the intracellular loops, and the extracellular loops.

Figure 2 schematically shows the two states, active and inactive, for a typical G
protein-coupled receptor and the linkage of the active state to the second messenger
transduction pathway.




Figure 3 is a sequence diagram of the preferred vector pCMV, including restriction
enzyme site locations.

Figure 4 is a diagrammatic representation of the signal measured comparing pCMV,
non-endogenous, constitutively active GPR30 inhibition of GPR6-mediated activation of CRE-Luc
reporter with endogenous GPR30 inhibition of GPR6-mediated activation of CRE-Luc
reporter.

Figure 5 is a diagrammatic representation of the signal measured comparing pCMV,
non-endogenous, constitutively activated GPR17 inhibition of GPR3-mediated activation of
CRE-Luc reporter with endogenous GPR17 inhibition of GPR3-mediated activation of CRE-Luc
reporter.

Figure 6 provides diagrammatic results of the signal measured comparing control
pCMV, endogenous APJ and non-endogenous APJ.

Figure 7 provides an illustration of IP3 production from non-endogenous human
5-HT2A receptor as compared to the endogenous version of this receptor.

Figure 8 provides dot-blot format results for GPR1 (8A), GPR30 (8B) and APJ (8C).

DETAILED DESCRIPTION 
The scientific literature that has evolved around receptors has adopted a number of terms
to refer to ligands having various effects on receptors. For clarity and consistency, the following
definitions will be used throughout this patent document. To the extent that these definitions
conflict with other definitions for these terms, the following definitions shall control:

AGONISTS shall mean compounds that activate the intracellular response when they bind
to the receptor, or enhance GTP binding to membranes.
AMINO ACID ABBREVIATIONS used herein are set out below:

ALANINE          ALA    A
ARGININE         ARG    R
ASPARAGINE       ASN    N
ASPARTIC ACID    ASP    D
CYSTEINE         CYS    C
GLUTAMIC ACID    GLU    E
GLUTAMINE        GLN    Q
GLYCINE          GLY    G
HISTIDINE        HIS    H
ISOLEUCINE       ILE    I
LEUCINE          LEU    L
LYSINE           LYS    K
METHIONINE       MET    M
PHENYLALANINE    PHE    F
PROLINE          PRO    P
SERINE           SER    S
THREONINE        THR    T
TRYPTOPHAN       TRP    W
TYROSINE         TYR    Y
VALINE           VAL    V

PARTIAL AGONISTS shall mean compounds which activate the intracellular response
when they bind to the receptor to a lesser degree/extent than do agonists, or enhance GTP binding
to membranes to a lesser degree/extent than do agonists.

ANTAGONIST shall mean compounds that competitively bind to the receptor at the
same site as the agonists but which do not activate the intracellular response initiated by the active
form of the receptor, and can thereby inhibit the intracellular responses by agonists or partial
agonists. ANTAGONISTS do not diminish the baseline intracellular response in the absence of
an agonist or partial agonist.

CANDIDATE COMPOUND shall mean a molecule (for example, and not limitation,
a chemical compound) which is amenable to a screening technique. Preferably, the phrase
"candidate compound" does not include compounds which were publicly known to be compounds
selected from the group consisting of inverse agonist, agonist or antagonist to a receptor, as
previously determined by an indirect identification process ("indirectly identified compound");
more preferably, not including an indirectly identified compound which has previously been
determined to have therapeutic efficacy in at least one mammal; and, most preferably, not
including an indirectly identified compound which has previously been determined to have
therapeutic utility in humans.

CODON shall mean a grouping of three nucleotides (or equivalents to nucleotides) which
generally comprise a nucleoside (adenosine (A), guanosine (G), cytidine (C), uridine (U) and
thymidine (T)) coupled to a phosphate group and which, when translated, encodes an amino acid.

COMPOUND EFFICACY shall mean a measurement of the ability of a compound to
inhibit or stimulate receptor functionality, as opposed to receptor binding affinity. A preferred
means of detecting compound efficacy is via measurement of, e.g., [35S]GTPγS binding, as further
disclosed in the Example section of this patent document.

CONSTITUTIVELY ACTIVATED RECEPTOR shall mean a receptor subject to
constitutive receptor activation. In accordance with the invention disclosed herein, a
non-endogenous, human constitutively activated G protein-coupled receptor is one that has been
mutated to include the amino acid cassette P1AA15X, as set forth in greater detail below.

CONSTITUTIVE RECEPTOR ACTIVATION shall mean stabilization of a receptor
in the active state by means other than binding of the receptor with its endogenous ligand or a
chemical equivalent thereof. Preferably, a G protein-coupled receptor subjected to constitutive
receptor activation in accordance with the invention disclosed herein evidences at least a 10%
difference in response (increase or decrease, as the case may be) to the signal measured for
constitutive activation as compared with the endogenous form of that GPCR, more preferably,
about a 25% difference in such comparative response, and most preferably about a 50% difference
in such comparative response. When used for the purposes of directly identifying candidate
compounds, it is most preferred that the signal difference be at least about 50% such that there is
a sufficient difference between the endogenous signal and the non-endogenous signal to
differentiate between selected candidate compounds. In most instances, the "difference" will be
an increase in signal; however, with respect to Gs-coupled GPCRs, the "difference" measured is
preferably a decrease, as will be set forth in greater detail below.
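
The comparison described in this definition can be sketched in a few lines of Python. This is only an illustration: the function names, the treatment of the 10%, 25% and 50% figures as lower bounds, and the tier labels are assumptions made for the example and are not taken from this patent document.

```python
def percent_difference(endogenous_signal: float, non_endogenous_signal: float) -> float:
    """Signed percent change of the non-endogenous signal relative to the endogenous signal."""
    if endogenous_signal == 0:
        raise ValueError("endogenous signal must be non-zero")
    return 100.0 * (non_endogenous_signal - endogenous_signal) / abs(endogenous_signal)


def preference_tier(endogenous_signal: float, non_endogenous_signal: float) -> str:
    """Classify the magnitude of the difference (increase or decrease).

    Assumption: the 10 / 25 / 50 percent figures are used as lower bounds.
    """
    magnitude = abs(percent_difference(endogenous_signal, non_endogenous_signal))
    if magnitude >= 50:
        return "most preferred"
    if magnitude >= 25:
        return "more preferred"
    if magnitude >= 10:
        return "preferred"
    return "below threshold"


# Hypothetical reporter readings in arbitrary units.
print(percent_difference(100, 160), preference_tier(100, 160))
print(percent_difference(200, 150), preference_tier(200, 150))
```

The sign of the result distinguishes an increase from a decrease, while the tier depends only on the magnitude, which mirrors the "increase or decrease, as the case may be" wording of the definition.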

CONTACT or CONTACTING shall mean bringing at least two moieties together,
whether in an in vitro system or an in vivo system.

DIRECTLY IDENTIFYING or DIRECTLY IDENTIFIED, in relationship to the
phrase "candidate compound", shall mean the screening of a candidate compound against a
constitutively activated G protein-coupled receptor, and assessing the compound efficacy of such
compound. This phrase is, under no circumstances, to be interpreted or understood to be
encompassed by or to encompass the phrase "indirectly identifying" or "indirectly identified."

ENDOGENOUS shall mean a material that is naturally produced by the genome of the
species. ENDOGENOUS in reference to, for example and not limitation, GPCR, shall mean that
which is naturally produced by a human, an insect, a plant, a bacterium, or a virus. By contrast,
the term NON-ENDOGENOUS in this context shall mean that which is not naturally produced
by the genome of a species. For example, and not limitation, a receptor which is not
constitutively active in its endogenous form, but when mutated by using the cassettes disclosed
herein and thereafter becomes constitutively active, is most preferably referred to herein as a
"non-endogenous, constitutively activated receptor." Both terms can be utilized to describe both "in
vivo" and "in vitro" systems. For example, and not limitation, in a screening approach, the
endogenous or non-endogenous receptor may be in reference to an in vitro screening system
whereby the receptor is expressed on the cell-surface of a mammalian cell. As a further example
and not limitation, where the genome of a mammal has been manipulated to include a
non-endogenous constitutively activated receptor, screening of a candidate compound by means of an
in vivo system is viable.

HOST CELL shall mean a cell capable of having a Plasmid and/or Vector incorporated
therein. In the case of a prokaryotic Host Cell, a Plasmid is typically replicated as an autonomous
molecule as the Host Cell replicates (generally, the Plasmid is thereafter isolated for introduction
into a eukaryotic Host Cell); in the case of a eukaryotic Host Cell, a Plasmid is integrated into the
cellular DNA of the Host Cell such that when the eukaryotic Host Cell replicates, the Plasmid
replicates. Preferably, for the purposes of the invention disclosed herein, the Host Cell is
eukaryotic, more preferably, mammalian, and most preferably selected from the group consisting
of 293, 293T and COS-7 cells.
INDIRECTLY IDENTIFYING or INDIRECTLY IDENTIFIED means the traditional
approach to the drug discovery process involving identification of an endogenous ligand specific
for an endogenous receptor, screening of candidate compounds against the receptor for
determination of those which interfere and/or compete with the ligand-receptor interaction, and
assessing the efficacy of the compound for affecting at least one second messenger pathway
associated with the activated receptor.

INHIBIT or INHIBITING, in relationship to the term "response" shall mean that a
response is decreased or prevented in the presence of a compound as opposed to in the absence of
the compound.
INVERSE AGONISTS shall mean compounds which bind to either the endogenous form
of the receptor or to the constitutively activated form of the receptor, and which inhibit the
baseline intracellular response initiated by the active form of the receptor below the normal base
level of activity which is observed in the absence of agonists or partial agonists, or decrease GTP
binding to membranes. Preferably, the baseline intracellular response is inhibited in the presence
of the inverse agonist by at least 30%, more preferably by at least 50%, and most preferably by at
least 75%, as compared with the baseline response in the absence of the inverse agonist.

KNOWN RECEPTOR shall mean an endogenous receptor for which the endogenous 
ligand specific for that receptor has been identified. 
LIGAND shall mean an endogenous, naturally occurring molecule specific for an
endogenous, naturally occurring receptor.

MUTANT or MUTATION in reference to an endogenous receptor's nucleic acid and/or
amino acid sequence shall mean a specified change or changes to such endogenous sequences such
that a mutated form of an endogenous, non-constitutively activated receptor evidences constitutive
activation of the receptor. In terms of equivalents to specific sequences, a subsequent mutated
form of a human receptor is considered to be equivalent to a first mutation of the human receptor
if (a) the level of constitutive activation of the subsequent mutated form of the receptor is
substantially the same as that evidenced by the first mutation of the receptor, and (b) the percent
sequence (amino acid and/or nucleic acid) homology between the subsequent mutated form of the
receptor and the first mutation of the receptor is at least about 80%, more preferably at least about
90% and most preferably at least 95%. Ideally, and owing to the fact that the most preferred
cassettes disclosed herein for achieving constitutive activation include a single amino acid and/or
codon change between the endogenous and the non-endogenous forms of the GPCR (i.e., X or
Xcodon), the percent sequence homology should be at least 98%.
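
As a rough illustration of why a single-residue change keeps two receptor sequences above the 98% figure, the percentage can be computed as follows. This sketch makes two assumptions that are not stated in this patent document: "homology" is approximated by simple positional identity, and the two sequences are already aligned and of equal length. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions carrying the same residue in two equal-length, pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy 350-residue "receptor" and a copy with one residue changed to lysine (K).
endogenous = "A" * 350
non_endogenous = "A" * 200 + "K" + "A" * 149

identity = percent_identity(endogenous, non_endogenous)
print(round(identity, 2), identity >= 98.0)
```

With one substitution the identity is 349/350; under this simple measure, a single change falls below 98% only for sequences shorter than 50 residues.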

ORPHAN RECEPTOR shall mean an endogenous receptor for which the endogenous 
ligand specific for that receptor has not been identified or is not known. 

PHARMACEUTICAL COMPOSITION shall mean a composition comprising at least
one active ingredient, whereby the composition is amenable to investigation for a specified,
efficacious outcome in a mammal (for example, and not limitation, a human). Those of ordinary
skill in the art will understand and appreciate the techniques appropriate for determining whether
an active ingredient has a desired efficacious outcome based upon the needs of the artisan.

PLASMID shall mean the combination of a Vector and cDNA. Generally, a Plasmid is
introduced into a Host Cell for the purpose of replication and/or expression of the cDNA as a
protein.

STIMULATE or STIMULATING, in relationship to the term "response" shall mean that
a response is increased in the presence of a compound as opposed to in the absence of the
compound.

TRANSVERSE or TRANSVERSING, in reference to either a defined nucleic acid
sequence or a defined amino acid sequence, shall mean that the sequence is located within at least
two different and defined regions. For example, in an amino acid sequence that is 10 amino acid
moieties in length, where 3 of the 10 moieties are in the TM6 region of a GPCR and the remaining
7 moieties are in the IC3 region of the GPCR, the 10 amino acid moiety can be described as
transversing the TM6 and IC3 regions of the GPCR.

VECTOR in reference to cDNA shall mean a circular DNA capable of incorporating at 
least one cDNA and capable of incorporation into a Host Cell. 

The order of the following sections is set forth for presentational efficiency and is not
intended, nor should be construed, as a limitation on the disclosure or the claims to follow.
A. Introduction 

The traditional study of receptors has always proceeded from the a priori assumption
(historically based) that the endogenous ligand must first be identified before discovery could
proceed to find antagonists and other molecules that could affect the receptor. Even in cases
where an antagonist might have been known first, the search immediately extended to looking for
the endogenous ligand. This mode of thinking has persisted in receptor research even after the
discovery of constitutively activated receptors. What has not been heretofore recognized is that
it is the active state of the receptor that is most useful for discovering agonists, partial agonists, and
inverse agonists of the receptor. For those diseases which result from an overly active receptor or
an under-active receptor, what is desired in a therapeutic drug is a compound which acts to
diminish the active state of a receptor or enhance the activity of the receptor, respectively, not
necessarily a drug which is an antagonist to the endogenous ligand. This is because a compound
that reduces or enhances the activity of the active receptor state need not bind at the same site as
the endogenous ligand. Thus, as taught by a method of this invention, any search for therapeutic
compounds should start by screening compounds against the ligand-independent active state.

Screening candidate compounds against non-endogenous, constitutively activated GPCRs
allows for the direct identification of candidate compounds which act at these cell surface
receptors, without requiring any prior knowledge or use of the receptor's endogenous ligand. By
determining areas within the body where the endogenous version of such GPCRs are expressed
and/or over-expressed, it is possible to determine related disease/disorder states which are
associated with the expression and/or over-expression of these receptors; such an approach is
disclosed in this patent document.
B. Disease/Disorder Identification and/or Selection

Most preferably, inverse agonists to the non-endogenous, constitutively activated GPCRs
can be identified using the materials of this invention. Such inverse agonists are ideal candidates
as lead compounds in drug discovery programs for treating diseases related to these receptors.
Because of the ability to directly identify inverse agonists, partial agonists or agonists to these
receptors, thereby allowing for the development of pharmaceutical compositions, a search for
diseases and disorders associated with these receptors is possible. For example, scanning both
diseased and normal tissue samples for the presence of these receptors now becomes more than an
academic exercise or one which might be pursued along the path of identifying, in the case of an
orphan receptor, an endogenous ligand. Tissue scans can be conducted across a broad range of
healthy and diseased tissues. Such tissue scans provide a preferred first step in associating a
specific receptor with a disease and/or disorder.

Preferably, the DNA sequence of the endogenous GPCR is used to make a probe for either
radiolabeled cDNA or RT-PCR identification of the expression of the GPCR in tissue samples.
The presence of a receptor in a diseased tissue, or the presence of the receptor at elevated or
decreased concentrations in diseased tissue compared to a normal tissue, can be preferably utilized
to identify a correlation with that disease. Receptors can equally well be localized to regions of
organs by this technique. Based on the known functions of the specific tissues to which the
receptor is localized, the putative functional role of the receptor can be deduced.

C. A "Human GPCR Proline Marker" Algorithm and the Creation of
Non-Endogenous, Constitutively-Active Human GPCRs

Among the many challenges facing the biotechnology arts is the unpredictability in
gleaning genetic information from one species and correlating that information to another species
- nowhere in this art does this problem evidence more annoying exacerbation than in the genetic
sequences that encode nucleic acids and proteins. Thus, for consistency and because of the
unpredictable nature of this art, the following invention is limited to human
GPCRs - applicability of this invention to other mammalian species, while a potential possibility,
is considered beyond mere rote application.

In general, when attempting to apply common "rules" from one related protein sequence
to another or from one species to another, the art has typically resorted to sequence alignment, i.e.,
sequences are linearized and attempts are then made to find regions of commonality between two
or more sequences. While useful, this approach does not always prove to result in meaningful
information. In the case of GPCRs, while the general structural motif is identical for all GPCRs,
the variations in lengths of the TMs, ECs and ICs make such alignment approaches from one
GPCR to another difficult at best. Thus, while it may be desirable to apply a consistent approach
to, e.g., constitutive activation from one GPCR to another, because of the great diversity in
sequence length, fidelity, etc. from one GPCR to the next, a generally applicable, and readily
successful mutational alignment approach is in essence not possible. In an analogy, such an
approach is akin to having a traveler start a journey at point A by giving the traveler dozens of
different maps to point B, without any scale or distance markers on any of the maps, and then
asking the traveler to find the shortest and most efficient route to destination B only by using the
maps. In such a situation, the task can be readily simplified by having (a) a common
"place-marker" on each map, and (b) the ability to measure the distance from the place-marker to
destination B - this, then, will allow the traveler to select the most efficient route from
starting-point A to destination B.

In essence, a feature of the invention is to provide such coordinates within human GPCRs
that readily allows for creation of a constitutively active form of the human GPCRs.

As those in the art appreciate, the transmembrane region of a cell is highly hydrophobic;
thus, using standard hydrophobicity plotting techniques, those in the art are readily able to
determine the TM regions of a GPCR, and specifically TM6 (this same approach is also
applicable to determining the EC and IC regions of the GPCR). It has been discovered that within
the TM6 region of human GPCRs, a common proline residue (generally near the middle of TM6),
acts as a constitutive activation "marker." By counting 15 amino acids from the proline marker,
the 16th amino acid (which is located in the IC3 loop), when mutated from its endogenous form
to a non-endogenous form, leads to constitutive activation of the receptor. For convenience, we
refer to this as the "Human GPCR Proline Marker" Algorithm. Although the non-endogenous
amino acid at this position can be any of the amino acids, most preferably, the non-endogenous
amino acid is lysine. While not wishing to be bound by any theory, we believe that this position
itself is unique and that the mutation at this location impacts the receptor to allow for constitutive
activation.
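
The counting step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used to prepare the disclosed receptors: the function names are hypothetical, the index of the TM6 proline is assumed to be supplied by the user (for example, from a hydrophobicity plot), and the count is assumed to run from the proline toward the N-terminus, since the IC3 loop precedes TM6 in the primary sequence and the cassette P1AA15X is described in this document in a C-terminus to N-terminus orientation.

```python
def marker_position(sequence: str, proline_index: int, offset: int = 16) -> int:
    """Return the 0-based index of residue X, located `offset` residues N-terminal of the proline.

    With offset=16 the 15 residues between X and the proline correspond to AA15.
    """
    if sequence[proline_index] != "P":
        raise ValueError("proline_index does not point at a proline residue")
    x_index = proline_index - offset
    if x_index < 0:
        raise ValueError("sequence is too short on the N-terminal side of the proline")
    return x_index


def apply_cassette(sequence: str, proline_index: int, replacement: str = "K") -> str:
    """Return a copy of the sequence with residue X replaced (lysine by default)."""
    x_index = marker_position(sequence, proline_index)
    if sequence[x_index] == replacement:
        raise ValueError("residue X already equals the replacement; choose another residue")
    return sequence[:x_index] + replacement + sequence[x_index + 1:]


# Toy sequence: 10 filler residues, X (glutamine), 15 intervening residues, the proline, 5 more.
toy = "A" * 10 + "Q" + "G" * 15 + "P" + "L" * 5
proline_at = toy.index("P")
print(marker_position(toy, proline_at))
print(apply_cassette(toy, proline_at))
```

The check that refuses to replace a residue with itself reflects the requirement that X be non-endogenous; a different replacement residue would be passed in that case.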

We note that, for example, when the endogenous amino acid at the 16th position is already
lysine (as is the case with GPR4 and GPR32), then in order for X to be a non-endogenous amino
acid, it must be other than lysine; thus, in those situations where the endogenous GPCR has an
endogenous lysine residue at the 16th position, the non-endogenous version of that GPCR
preferably incorporates an amino acid other than lysine, preferably alanine, histidine and arginine,
at this position. Of further note, it has been determined that GPR4 appears to be linked to Gs and
active in its endogenous form (data not shown).

Because there are only 20 naturally occurring amino acids (although the use of
non-naturally occurring amino acids is also viable), selection of a particular non-endogenous amino
acid for substitution at this 16th position is viable and allows for efficient selection of a
non-endogenous amino acid that fits the needs of the investigator. However, as noted, the more
preferred non-endogenous amino acids at the 16th position are lysine, histidine, arginine and
alanine, with lysine being most preferred. Those of ordinary skill in the art are credited with the
ability to readily determine proficient methods for changing the sequence of a codon to achieve
a desired mutation.
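
One simple way to think about the codon change is to pick, among the standard codons for the desired residue, the one that requires the fewest nucleotide substitutions. The sketch below illustrates that idea only; the function names and the "fewest substitutions" criterion are assumptions made for the example, and this document does not prescribe any particular selection method. The codon lists are those of the standard genetic code for the four preferred residues.

```python
# Standard genetic code codons (DNA alphabet) for the preferred non-endogenous residues.
TARGET_CODONS = {
    "K": ("AAA", "AAG"),                              # lysine
    "A": ("GCT", "GCC", "GCA", "GCG"),                # alanine
    "H": ("CAT", "CAC"),                              # histidine
    "R": ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG"),  # arginine
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_codon(endogenous_codon: str, residue: str = "K") -> str:
    """Pick the codon for `residue` needing the fewest nucleotide changes (ties broken alphabetically)."""
    codon = endogenous_codon.upper().replace("U", "T")
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError("expected a three-nucleotide codon")
    return min(TARGET_CODONS[residue], key=lambda c: (hamming(codon, c), c))


print(closest_codon("CAG"))        # glutamine codon changed toward lysine
print(closest_codon("AAG", "A"))   # lysine codon changed toward alanine
```

In practice other considerations, such as codon usage in the host cell or the creation of a restriction site for screening, may outweigh the number of substitutions.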

It has also been discovered that occasionally, but not always, the proline residue marker
will be preceded in TM6 by W2 (i.e., W2P1AA15X) where W is tryptophan and 2 is any amino
acid residue.
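
Where a sequence is available, candidate prolines that carry this tryptophan-any-residue-proline context can be located with a short regular-expression search. The sketch below is an illustration only: the function name and the toy fragment are hypothetical, and because the motif is said to occur only occasionally, a hit or a miss is at most a secondary cue alongside the hydrophobicity-based assignment of TM6.

```python
import re

# W, then any single residue, then P; the lookahead allows overlapping matches to be reported.
W2P_PATTERN = re.compile(r"(?=W[A-Z]P)")


def prolines_preceded_by_w2(sequence: str) -> list[int]:
    """Return 0-based indices of prolines that are preceded by tryptophan plus one residue."""
    return [match.start() + 2 for match in W2P_PATTERN.finditer(sequence.upper())]


# Hypothetical fragment: the proline at index 6 sits in a W-L-P context, the one at index 10 does not.
print(prolines_preceded_by_w2("FLCWLPFAAPL"))
```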

This discovery, among other things, negates the need for unpredictable
sequence alignment approaches commonly used by the art. Indeed, the strength of this discovery,
while an algorithm in nature, is that it can be applied in a facile manner to human GPCRs, with
dexterous simplicity by those in the art, to achieve a unique and highly useful result: a
constitutively activated version of a human GPCR. Because many years and [...]
of money will be required to determine the endogenous ligands for the human GPCRs that the
Human Genome Project is uncovering, the disclosed invention not only reduces the time necessary
to positively exploit this sequence information, but does so at significant cost savings. This approach truly
validates the importance of the Human Genome Project because it allows for the utilization of
genetic information to not only understand the role of the GPCRs in, e.g., diseases, but also
provides the opportunity to improve the human condition.
D. Screening of Candidate Compounds 

1. Generic GPCR screening assay techniques 

When a G protein receptor becomes constitutively active, it couples to a G protein (e.g.,
Gq, Gs, Gi, Go) and stimulates release and subsequent binding of GTP to the G protein. The G
protein then acts as a GTPase and slowly hydrolyzes the GTP to GDP, whereby the receptor,
under normal conditions, becomes deactivated. However, constitutively activated receptors,
including the non-endogenous, human constitutively active GPCRs of the present invention,
continue to exchange GDP for GTP. A non-hydrolyzable analog of GTP, [35S]GTPγS, can be
used to monitor enhanced binding to G proteins present on membranes which express
constitutively activated receptors. It is reported that [35S]GTPγS can be used to monitor G protein
coupling to membranes in the absence and presence of ligand. An example of this monitoring,
among other examples well-known and available to those in the art, was reported by Traynor and
Nahorski in 1995. The preferred use of this assay system is for initial screening of candidate
compounds because the system is generically applicable to all G protein-coupled receptors
regardless of the particular G protein that interacts with the intracellular domain of the receptor.

2. Specific GPCR screening assay techniques

Once candidate compounds are identified using the "generic" G protein-coupled
receptor assay (i.e., an assay to select compounds that are agonists, partial
agonists, or inverse agonists), further screening to confirm that the compounds have
interacted at the receptor site is preferred. For example, a compound identified by the
"generic" assay may not bind to the receptor, but may instead merely "uncouple" the G
protein from the intracellular domain.

a. Gs and Gi.

Gs stimulates the enzyme adenylyl cyclase. Gi (and Go), on the other hand,
inhibit this enzyme. Adenylyl cyclase catalyzes the conversion of ATP to cAMP; thus,
constitutively activated GPCRs that couple the Gs protein are associated with increased
cellular levels of cAMP. On the other hand, constitutively activated GPCRs that couple the
Gi (or Go) protein are associated with decreased cellular levels of cAMP. See, generally,
"Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.)
Nichols, J.G. et al eds. Sinauer Associates, Inc. (1992). Thus, assays that detect cAMP can
be utilized to determine if a candidate compound is, e.g., an inverse agonist to the receptor
(i.e., such a compound would decrease the levels of cAMP). A variety of approaches known
in the art for measuring cAMP can be utilized; a most preferred approach relies upon the use
of anti-cAMP antibodies in an ELISA-based format. Another type of assay that can be
utilized is a whole cell second messenger reporter system assay. Promoters on genes drive the
expression of the proteins that a particular gene encodes. Cyclic AMP drives gene expression by
promoting the binding of a cAMP-responsive DNA binding protein or transcription factor (CREB)
which then binds to the promoter at specific sites called cAMP response elements and drives the
expression of the gene. Reporter systems can be constructed which have a promoter containing
multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or luciferase.
Thus, a constitutively activated Gs-linked receptor causes the accumulation of cAMP that then
activates the gene and expression of the reporter protein. The reporter protein such as
β-galactosidase or luciferase can then be detected using standard biochemical assays (Chen et al.
1995). With respect to GPCRs that link to Gi (or Go), and thus decrease levels of cAMP, an
approach to the screening of, e.g., inverse agonists, based upon utilization of receptors that link to
Gs (and thus increase levels of cAMP) is disclosed in the Example section with respect to GPR17
and GPR30.
b. Go and Gq.

Gq and Go are associated with activation of the enzyme phospholipase C,
which in turn hydrolyzes the phospholipid PIP2, releasing two intracellular messengers:
diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). Increased accumulation of IP3
is associated with activation of Gq- and Go-associated receptors. See, generally, "Indirect
Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.) Nichols,
J.G. et al eds. Sinauer Associates, Inc. (1992). Assays that detect IP3 accumulation can be
utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gq- or
Go-associated receptor (i.e., such a compound would decrease the levels of IP3). Gq-associated
receptors can also be examined using an AP1 reporter assay in that Gq-dependent
phospholipase C causes activation of genes containing AP1 elements; thus, activated
Gq-associated receptors will evidence an increase in the expression of such genes, whereby
inverse agonists thereto will evidence a decrease in such expression, and agonists will
evidence an increase in such expression. Assays for such detection are commercially
available.

E. Medicinal Chemistry 

Generally, but not always, direct identification of candidate compounds is preferably
conducted in conjunction with compounds generated via combinatorial chemistry techniques,
whereby thousands of compounds are randomly prepared for such analysis. Generally, the
results of such screening will be compounds having unique core structures; thereafter, these
compounds are preferably subjected to additional chemical modification around a preferred
core structure(s) to further enhance the medicinal properties thereof. Such techniques are
known to those in the art and will not be addressed in detail in this patent document.

F. Pharmaceutical Compositions 

Candidate compounds selected for further development can be formulated into
pharmaceutical compositions using techniques well known to those in the art. Suitable
pharmaceutically-acceptable carriers are available to those in the art; for example, see Remington's
Pharmaceutical Sciences, 16th Edition, 1980, Mack Publishing Co., (Osol et al., eds.).

G. Other Utility 

Although a preferred use of the non-endogenous versions of the disclosed human GPCRs
is for the direct identification of candidate compounds as inverse agonists, agonists or partial
agonists (preferably for use as pharmaceutical agents), these receptors can also be utilized in
research settings. For example, in vitro and in vivo systems incorporating these receptors can be
utilized to further elucidate and understand the roles of the receptors in the human condition, both
normal and diseased, as well as understanding the role of constitutive activation as it applies to
understanding the signaling cascade. A value in these non-endogenous receptors is that their
utility as a research tool is enhanced in that, because of their unique features, the disclosed
receptors can be used to understand the role of a particular receptor in the human body before the
endogenous ligand therefor is identified. Other uses of the disclosed receptors will become
apparent to those in the art based upon, inter alia, a review of this patent document.



EXAMPLES 

The following examples are presented for purposes of elucidation, and not limitation,
of the present invention. Following the teaching of this patent document that a mutational
cassette may be utilized in the IC3 loop of human GPCRs based upon a position relative to
a proline residue in TM6 to constitutively activate the receptor, and while specific nucleic acid
and amino acid sequences are disclosed herein, those of ordinary skill in the art are credited
with the ability to make minor modifications to these sequences while achieving the same or
substantially similar results reported below. Particular approaches to sequence mutations are
within the purview of the artisan based upon the particular needs of the artisan.

5 Example 1 

Preparation of Endogenous Human GPCRs 

A variety of GPCRs were utilized in the Examples to follow. Some endogenous human
GPCRs were graciously provided in expression vectors (as acknowledged below) and other
endogenous human GPCRs were synthesized de novo using publicly-available sequence
information.

1. GPR1 (GenBank Accession Number: U13666)

The human cDNA sequence for GPR1 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR1 cDNA (1.4kB fragment) was excised from the pRcCMV
vector as a NdeI-XbaI fragment and was subcloned into the NdeI-XbaI site of pCMV vector (see
Figure 3). Nucleic acid (SEQ.ID.NO.: 1) and amino acid (SEQ.ID.NO.: 2) sequences for human
GPR1 were thereafter determined and verified.

2. GPR4 (GenBank Accession Numbers: L36148, U35399, U21051)

The human cDNA sequence for GPR4 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR4 cDNA (1.4kB fragment) was excised from the pRcCMV
vector as an ApaI(blunted)-XbaI fragment and was subcloned (with most of the 5' untranslated
region removed) into HindIII(blunted)-XbaI site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 3)
and amino acid (SEQ.ID.NO.: 4) sequences for human GPR4 were thereafter determined and
verified.

3. GPR5 (GenBank Accession Number: L36149)

The cDNA for human GPR5 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 64°C
for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the sequence:
5'-TATGAATTCAGATGCTCTAAACGTCCCTGC-3' (SEQ.ID.NO.: 5)
and the 3' primer contained a BamHI site with the sequence:
5'-TCCGGATCCACCTGCACCTGCGCCTGCACC-3' (SEQ.ID.NO.: 6).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 7) and amino acid (SEQ.ID.NO.:
8) sequences for human GPR5 were thereafter determined and verified.
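
Because each primer in this Example is described as carrying a particular restriction site, a transcribed primer can be checked against the stated enzyme. The following is a minimal Python sketch, not part of the disclosed method; the helper name `find_site` is hypothetical, the recognition sequences are the standard ones for EcoRI, BamHI and HindIII, and the two primers are SEQ.ID.NO.: 5 and SEQ.ID.NO.: 6 as printed above.

```python
# Standard recognition sequences for the enzymes named in this Example.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_site(primer, enzyme):
    """Return the 0-based position of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


gpr5_forward = "TATGAATTCAGATGCTCTAAACGTCCCTGC"  # SEQ.ID.NO.: 5, stated to contain EcoRI
gpr5_reverse = "TCCGGATCCACCTGCACCTGCGCCTGCACC"  # SEQ.ID.NO.: 6, stated to contain BamHI

for name, primer, enzyme in [
    ("5' primer", gpr5_forward, "EcoRI"),
    ("3' primer", gpr5_reverse, "BamHI"),
]:
    position = find_site(primer, enzyme)
    status = "found at position %d" % position if position >= 0 else "NOT found"
    print(name, enzyme, status)
```

A primer whose stated site is reported as not found is a candidate for a transcription error and is worth comparing against the sequence listing.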

4. GPR7 (GenBank Accession Number: U22491) 

The cDNA for human GPR7 was generated and cloned into pCMV expression 

vector as follows: PCR was performed using genomic DNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site
with the sequence:
5'-GCAAGCTTGGGGGACGCCAGGTCGCCGGCT-3' (SEQ.ID.NO.: 9)
and the 3' primer contained a BamHI site with the sequence:
5'-GCGGATCCGGACGCTGGGGGAGTCAGGCTGC-3' (SEQ.ID.NO.: 10).
The 1.1 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 11) and amino acid (SEQ.ID.NO.:
12) sequences for human GPR7 were thereafter determined and verified.

5. GPR8 (GenBank Accession Number: U22492)

The cDNA for human GPR8 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 6TC
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-CGGAATTCGTCAACGGTCCCAGCTACAATG-3' (SEQ.ID.NO.: 13)
and the 3' primer contained a BamHI site with the sequence:
5'-ATGGATCCCAGGCCCTTCAGCACCGCAATAT-3' (SEQ.ID.NO.: 14).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. All 4 cDNA clones sequenced contained a possible
polymorphism involving a change of amino acid 206 from Arg to Gln. Aside from this
difference, nucleic acid (SEQ.ID.NO.: 15) and amino acid (SEQ.ID.NO.: 16) sequences for human
GPR8 were thereafter determined and verified.
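
As a worked illustration of the cycling programs quoted in this Example, the programmed hold time for 30 cycles of 1 min, 1 min, and 1 min and 20 sec can be computed as below. This is a minimal sketch; the helper name `total_hold_seconds` is hypothetical, and ramp times between steps, which depend on the instrument, are ignored.

```python
def total_hold_seconds(cycles, step_seconds):
    """Sum the programmed hold times of one cycle and multiply by the cycle count."""
    return cycles * sum(step_seconds)


# Denaturation 1 min, annealing 1 min, extension 1 min 20 sec, as stated above.
gpr8_steps = (60, 60, 80)
gpr8_total = total_hold_seconds(30, gpr8_steps)

print(gpr8_total, "seconds")        # 6000 seconds
print(gpr8_total / 60, "minutes")   # 100.0 minutes
```

The same helper applies to the other programs in this Example by changing the cycle count and the step durations.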

6. GPR9 (GenBank Accession Number: X95876) 

The cDNA for human GPR9 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using a clone (provided by Brian O'Dowd) as template and 
pfu polymerase (Stratagene) with the buffer system provided by the manufacturer supplemented
with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4 nucleotides. The cycle
condition was 25 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 2.5 min. The 5' PCR
primer contained an EcoRI site with the sequence:
5'-ACGAATTCAGCCATGGTCCTTGAGGTGAGTGACCACCAAGTGCTAAAT-3'
(SEQ.ID.NO.: 17)
and the 3' primer contained a BamHI site with the sequence:
5'-GAGGATCCTGGAATGCGGGGAAGTCAG-3' (SEQ.ID.NO.: 18).
The 1.2 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 19) and amino acid (SEQ.ID.NO.: 20) sequences
for human GPR9 were thereafter determined and verified.

7. GPR9-6 (GenBank Accession Number: U45982)
The cDNA for human GPR9-6 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-TTAAGCTTGACCTAATGCCATCTTGTGTCC-3' (SEQ.ID.NO.: 21)
and the 3' primer contained a BamHI site with the sequence:
5'-TTGGATCCAAAAGAACCATGCACCTCAGAG-3' (SEQ.ID.NO.: 22).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 23) and amino acid (SEQ.ID.NO.: 24)
sequences for human GPR9-6 were thereafter determined and verified.

8. GPR10 (GenBank Accession Number: U32672)
The human cDNA sequence for GPR10 was provided in pRcCMV by Brian
O'Dowd (University of Toronto). GPR10 cDNA (UkB fragment) was excised from the
pRcCMV vector as an EcoRI-XbaI fragment and was subcloned into EcoRI-XbaI site of pCMV
vector. Nucleic acid (SEQ.ID.NO.: 25) and amino acid (SEQ.ID.NO.: 26) sequences for human
GPR10 were thereafter determined and verified.

9. GPR15 (GenBank Accession Number: U34806) 

The human cDNA sequence for GPR15 was provided in pCDNA3 by Brian
O'Dowd (University of Toronto). GPR15 cDNA (1.5kB fragment) was excised from the
pCDNA3 vector as a HindIII-Bam fragment and was subcloned into HindIII-Bam site of pCMV
vector. Nucleic acid (SEQ.ID.NO.: 27) and amino acid (SEQ.ID.NO.: 28) sequences for human
GPR15 were thereafter determined and verified.

10. GPR17 (GenBank Accession Number: Z94154) 

The cDNA for human GPR17 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; … for
1 min and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-AGAATTCTGACTCCAGCCAAAGCATGAAT-3' (SEQ.ID.NO.: 29)
and the 3' primer contained a BamHI site with the sequence:
5'-GCTGGATCCTAAACAGTCTGCGCTCGGCCT-3' (SEQ.ID.NO.: 30).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 31) and amino acid (SEQ.ID.NO.:
32) sequences for human GPR17 were thereafter determined and verified.

11. GPR18 (GenBank Accession Number: L42324) 

The cDNA for human GPR18 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 54°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-ATAAGATGATCACCCTGAACAATCAAGAT-3' (SEQ.ID.NO.: 33)
and the 3' primer contained an EcoRI site with the sequence:
5'-TCCGAATTCATAACATTTCACTGTTTATATTGC-3' (SEQ.ID.NO.: 34).
The 1.0 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV
expression vector. All 8 cDNA clones sequenced contained 4 possible polymorphisms involving
changes of amino acid 12 from Thr to Pro, amino acid 86 from Ala to Glu, amino acid 97 from
Ile to Leu and amino acid 310 from Leu to Met. Aside from these changes, nucleic acid
(SEQ.ID.NO.: 35) and amino acid (SEQ.ID.NO.: 36) sequences for human GPR18 were thereafter
determined and verified.

12. GPR20 (GenBank Accession Number: U66579) 
The cDNA for human GPR20 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-GGAAGGTTGGAGGGGTGGGGTGTGGTGG-3' (SEQ.ID.NO.: 37)
and the 3' primer contained a BamHI site with the sequence:
5'-ATGGATGGTGAGGTTGGGGGGGTGGGAGA-3' (SEQ.ID.NO.: 38).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 39) and amino acid (SEQ.ID.NO.: 40)
sequences for human GPR20 were thereafter determined and verified.

13. GPR21 (GenBank Accession Number: U66580) 
The cDNA for human GPR21 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-GAGAATTGACTGCTGAGGTGAAGATGAACT-3' (SEQ.ID.NO.: 41)
and the 3' primer contained a BamHI site with the sequence:
5'-GGGGATGGGGGTAAGTGAGGGAGTTGAGAT-3' (SEQ.ID.NO.: 42).
The 1.1 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 43) and amino acid (SEQ.ID.NO.: 44)
sequences for human GPR21 were thereafter determined and verified.

14. GPR22 (GenBank Accession Number: U66581)

The cDNA for human GPR22 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 50°C
for 1 min; and 72°C for 1.5 min. The 5' PCR primer was kinased with the sequence:
5'-TGGGGGGGGAAAAAAAGGAAGTGGTGGAAA-3' (SEQ.ID.NO.: 45)
and the 3' primer contained a BamHI site with the sequence:
5'-TAGGATGGATTTGAATGTGGATTTGGTGAAA-3' (SEQ.ID.NO.: 46).
The 1.38 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 47) and amino acid (SEQ.ID.NO.: 48)
sequences for human GPR22 were thereafter determined and verified.

15. GPR24 (GenBank Accession Number: U71092) 

The cDNA for human GPR24 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for
1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contains a HindIII site with the
sequence:
5'-GTGAAGGTrGCGTGTGGTGGGTGCAGGAGG-3' (SEQ.ID.NO.: 49)
and the 3' primer contains an EcoRI site with the sequence:
5'-GGAGAATTCGCGGTGGGGTGTTGTGGTGGCC-3' (SEQ.ID.NO.: 50).
The 1.3 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. The nucleic acid (SEQ.ID.NO.: 51) and amino acid sequence
(SEQ.ID.NO.: 52) for human GPR24 were thereafter determined and verified.

16. GPR30 (GenBank Accession Number: U63917) 
The cDNA for human GPR30 was generated and cloned as follows: the coding 
sequence of GPR30 (1128 bp in length) was amplified from genomic DNA using the primers:
5'-GGCGGATGGATGGATGTGAGTTGGGAA-3' (SEQ.ID.NO.: 53) and
5'-GGCGGATCGGTACACGGCAGTGCrGAA-3' (SEQ.ID.NO.: 54).
The amplified product was then cloned into a commercially available vector, pCR2.1 (Invitrogen),
using a "TOPO-TA Cloning Kit" (Invitrogen, #K4500-01), following manufacturer instructions.
The full-length GPR30 insert was liberated by digestion with BamHI, separated from the vector
by agarose gel electrophoresis, and purified using a Sephaglas BandPrep™ Kit (Pharmacia,
#27-9285-01) following manufacturer instructions. The nucleic acid (SEQ.ID.NO.: 55) and amino acid
sequence (SEQ.ID.NO.: 56) for human GPR30 were thereafter determined and verified.
17. GPR31 (GenBank Accession Number: U65402)
The cDNA for human GPR31 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 58°C
for 1 min; and 72°C for 2 min. The 5' PCR primer contained an EcoRI site with the sequence:
5'-AAGGAATTCACGGCCGGGTGATGCCATTCCC-3' (SEQ.ID.NO.: 57)
and the 3' primer contained a BamHI site with the sequence:
5'-GGTGGATCCATAAAGACGGGCGTTGAGGAG-3' (SEQ.ID.NO.: 58).
The 1.0 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 59) and amino acid (SEQ.ID.NO.:
60) sequences for human GPR31 were thereafter determined and verified.
18. GPR32 (GenBank Accession Number: AF045764) 

The cDNA for human GPR32 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for
1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-TAAGAATTCCATAAAAATTATGGAATGG-3' (SEQ.ID.NO.: 243)
and the 3' primer contained a BamHI site with the sequence:
5'-CCAGGATCCAGCTGAAGTCTTCCATCATTC-3' (SEQ.ID.NO.: 244).
The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 245) and amino acid (SEQ.ID.NO.:
246) sequences for human GPR32 were thereafter determined and verified.

19. GPR40 (GenBank Accession Number: AF024687)
The cDNA for human GPR40 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 65°C for
1 min and 72°C for 1 min and 10 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-GCAGAATTCGGCGGCCCCATGGACGTGGCCCC-3' (SEQ.ID.NO.: 247)
and the 3' primer contained a BamHI site with the sequence:
5'-GGTGGATCCGCCGAGGAGTGGCGTTACTTC-3' (SEQ.ID.NO.: 248).
The 1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site
of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 249) and amino acid (SEQ.ID.NO.: 250)
sequences for human GPR40 were thereafter determined and verified.

20. GPR41 (GenBank Accession Number: AF024688)

The cDNA for human GPR41 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, 65°C for
1 min and 72°C for 1 min and 10 sec. The 5' PCR primer contained an HindIII site with the
sequence:
5'-CTCAAGCTTACTCTCTCTCACCAGTGGCCAC-3' (SEQ.ID.NO.: 251)
and the 3' primer was kinased with the sequence:
5'-CCCTCCTCCCCCGGAGGACCTAGC-3' (SEQ.ID.NO.: 252).
The 1 kb PCR fragment was digested with HindIII and cloned into HindIII-blunt site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 253) and amino acid (SEQ.ID.NO.: 254)
sequences for human GPR41 were thereafter determined and verified.

21. GPR43 (GenBank Accession Number: AF024690)
The cDNA for human GPR43 was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for
1 min; and 72°C for 1 min and 10 sec. The 5' PCR primer contains an HindIII site with the
sequence:
5'-TTTAAGCTTCCCCTCCAGGATGCTGCCGGAC-3' (SEQ.ID.NO.: 255)
and the 3' primer contained an EcoRI site with the sequence:
5'-GGCGAATTCTGAAGGTCCAGGGAAACTGCTA-3' (SEQ.ID.NO.: 256).
The 1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site
of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 257) and amino acid (SEQ.ID.NO.: 258)
sequences for human GPR43 were thereafter determined and verified.

22. APJ (GenBank Accession Number: U03642)

Human APJ cDNA (in pRcCMV vector) was provided by Brian O'Dowd
(University of Toronto). The human APJ cDNA was excised from the pRcCMV vector as an
EcoRI-XbaI (blunted) fragment and was subcloned into EcoRI-SmaI site of pCMV vector.
Nucleic acid (SEQ.ID.NO.: 61) and amino acid (SEQ.ID.NO.: 62) sequences for human APJ
were thereafter determined and verified.

23. BLR1 (GenBank Accession Number: X68149)

The cDNA for human BLR1 was generated and cloned into pCMV expression
vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-TGAGAATTCTGGTGACTCACAGCCGGCACAG-3' (SEQ.ID.NO.: 63)
and the 3' primer contained a BamHI site with the sequence:
5'-GCCGGATCCAAGGAAAAGCAGCAATAAAAGG-3' (SEQ.ID.NO.: 64).
The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 65) and amino acid (SEQ.ID.NO.: 66) sequences
for human BLR1 were thereafter determined and verified.

24. CEPR (GenBank Accession Number: U77827) 
The cDNA for human CEPR was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-CAAAGCTTGAAAGCTGCACGGTGCAGAGAC-3' (SEQ.ID.NO.: 67)
and the 3' primer contained a BamHI site with the sequence:
5'-GCGGATCCCGAGTCACACCCTGGCTGGGCC-3' (SEQ.ID.NO.: 68).
The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of
pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 69) and amino acid (SEQ.ID.NO.: 70)
sequences for human CEPR were thereafter determined and verified.

25. EBI1 (GenBank Accession Number: L31581)

The cDNA for human EBI1 was generated and cloned into pCMV expression
vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C
for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-AGAGAATTCGTGTGTGGTTTTACCGGGCAG-3' (SEQ.ID.NO.: 71)
and the 3' primer contained a BamHI site with the sequence:
5'-GTGGGATGGAGGCAGAAGAGTGGGGTATGG-3' (SEQ.ID.NO.: 72).
The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 73) and amino acid (SEQ.ID.NO.:
74) sequences for human EBI1 were thereafter determined and verified.

26. EBI2 (GenBank Accession Number: L08177)

The cDNA for human EBI2 was generated and cloned into pCMV expression
vector as follows: PCR was performed using cDNA clone (graciously provided by …
University of Virginia Health Sciences Center; the vector utilized …)
as template and pfu polymerase (Stratagene) with the buffer system provided by the manufacturer
supplemented with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4
nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 60°C for 1 min; and 72°C for
1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:
5'-CTGGAATTCACCTGGACCACCACCAATGGATA-3' (SEQ.ID.NO.: 75)
and the 3' primer contained a BamHI site with the sequence:
5'-CTCGGATCCTGCAAAGnTGTCATACAG 17-3' (SEQ.ID.NO.: 76).
The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 77) and amino acid (SEQ.ID.NO.:
78) sequences for human EBI2 were thereafter determined and verified.

27. ETBR-LP2 (GenBank Accession Number: D38449) 
The cDNA for human ETBR-LP2 was generated and cloned into pCMV 
expression vector as follows: PCR was performed using brain cDNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 65°C for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-GTGGAATTCTGGTGGTGATGGAGGCATGCGG-3' (SEQ.ID.NO.: 79)
and the 3' primer contained a BamHI site with the sequence:
5'-GCrGGATCCCGAGGGGTAGTGGGGCCTGAG-3' (SEQ.ID.NO.: 80).
The 1.5 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 81) and amino acid
(SEQ.ID.NO.: 82) sequences for human ETBR-LP2 were thereafter determined and verified.
28. GHSR (GenBank Accession Number: U60179) 

The cDNA for human GHSR was generated and cloned into pCMV expression 
vector as follows: PCR was performed using hippocampus cDNA as template and TaqPlus
Precision polymerase (Stratagene) with the buffer system provided by the manufacturer, 0.25 µM
of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of:
94°C for 1 min; 68°C for 1 min; and 72°C for 1 min and 10 sec. For first round PCR, the 5' PCR
primer sequence was:
5'-ATGTGGAACGGGAGGCGGAGGG-3' (SEQ.ID.NO.: 83)
and the 3' primer sequence was:
5'-TGATGTATTAATAGTAGATTGT-3' (SEQ.ID.NO.: 84).
Two microliters of the first round PCR was used as template for the second round PCR where the
5' primer was kinased with sequence:
5'-TAGGATGTGGAAGGGGACGCGCAGCGAAGAGCCGGGGT-3' (SEQ.ID.NO.: 85)
and the 3' primer contained an EcoRI site with the sequence:
5'-GGGAATTCATGTATTAATACTAGATTGTGTGCAGGCCCG-3' (SEQ.ID.NO.: 86).
The 1.1 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 87) and amino acid (SEQ.ID.NO.: 88) sequences
for human GHSR were thereafter determined and verified.

29. GPCR-CNS (GenBank Accession Number: AF017262) 
The cDNA for human GPCR-CNS was generated and cloned into pCMV 
expression vector as follows: PCR was performed using brain cDNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for
1 min; 65°C for 1 min; and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the
sequence:
5'-GCAAGCTTGTGCCCTCACCAAGCCATGCGAGCC-3' (SEQ.ID.NO.: 89)
and the 3' primer contained an EcoRI site with the sequence:
5'-CC3GAATTCAGCAATGAGTTCCGACAGAAGG-3' (SEQ.ID.NO.: 90).
The 1.9 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. All nine clones sequenced contained a potential polymorphism
involving a S284C change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 91) and amino
acid (SEQ.ID.NO.: 92) sequences for human GPCR-CNS were thereafter determined and verified.
30. GPR-NGA (GenBank Accession Number: U55312) 

The cDNA for human GPR-NGA was generated and cloned into pCMV 
expression vector as follows: PCR was performed using genomic DNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each
primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of 94°C for
1 min, 56°C for 1 min and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the
sequence:
5'-CAGAATTCAGAGAAAAAAAGTGAATATGGTnTT-3' (SEQ.ID.NO.: 93)
and the 3' primer contained a BamHI site with the sequence:
5'-TrGGATCCCTGGTGCATAACAATTGAAAGAAT-3' (SEQ.ID.NO.: 94).
The 1.3 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 95) and amino acid (SEQ.ID.NO.:
96) sequences for human GPR-NGA were thereafter determined and verified.
31. H9 (GenBank Accession Number: U52219) 

The cDNA for human H9 was generated and cloned into pCMV expression
vector as follows: PCR was performed using pituitary cDNA as template and rTth polymerase
(Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and
0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min, 62°C for
1 min and 72°C for 2 min. The 5' PCR primer contains a HindIII site with the sequence:
5'-GGAAAGCTTAACGATCCCCAGGAGCAACAT-3' (SEQ.ID.NO.: 97)
and the 3' primer contains a BamHI site with the sequence:
5'-CTGGGATCCTACGAGAGCATTTTTCACACAG-3' (SEQ.ID.NO.: 98).
The 1.9 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI
site of pCMV expression vector. When compared to the published sequences, a
different isoform with 12 bp in frame insertion in the cytoplasmic tail was also identified and
designated "H9b." Both isoforms contain two potential polymorphisms involving changes
of amino acid P320S and amino acid G448A. Isoform H9a contained another potential
polymorphism of amino acid S493N, while isoform H9b contained two additional potential
polymorphisms involving changes of amino acid I502T and amino acid A532T
(corresponding to amino acid 528 of isoform H9a). Nucleic acid (SEQ.ID.NO.: 99) and
amino acid (SEQ.ID.NO.: 100) sequences for human H9 were thereafter determined and
verified (in the section below, both isoforms were mutated in accordance with the Human
GPCR Proline Marker Algorithm).
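
The single-letter substitution codes used for the potential polymorphisms above (for example P320S) can be expanded into the "amino acid N from Xxx to Yyy" wording used elsewhere in this Example. The following is a minimal Python sketch; the names `parse_substitution` and `describe` are hypothetical, and the lookup table covers only the residues that occur in the codes quoted for H9.

```python
import re

# Three-letter names for the residues appearing in the H9 polymorphism codes.
THREE_LETTER = {
    "P": "Pro", "S": "Ser", "G": "Gly", "A": "Ala",
    "N": "Asn", "I": "Ile", "T": "Thr",
}

# Original residue, 1-based position, replacement residue (e.g. "P320S").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'P320S' into ('P', 320, 'S')."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError("not a single-residue substitution code: %r" % code)
    return match.group(1), int(match.group(2)), match.group(3)


def describe(code):
    """Render a substitution code in the long form used in this Example."""
    original, position, replacement = parse_substitution(code)
    return "amino acid %d from %s to %s" % (
        position, THREE_LETTER[original], THREE_LETTER[replacement]
    )


for code in ["P320S", "G448A", "S493N", "I502T", "A532T"]:
    print(code, "->", describe(code))
```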

32. HB954 (GenBank Accession Number: D38449) 

The cDNA for human HB954 was generated and cloned into pCMV expression
vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin
Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each of the 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, … for 1 min
and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the sequence:
5'-TCCAAGCTTCGCCATGGGACATAACGGGAGCr-3' (SEQ.ID.NO.: 101)
and the 3' primer contained an EcoRI site with the sequence:
5'-CGTGAATTCCAAGAATrTACAATCCTTGCT-3' (SEQ.ID.NO.: 102).
The 1.6 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 103) and amino acid
(SEQ.ID.NO.: 104) sequences for human HB954 were thereafter determined and verified.
33. HG38 (GenBank Accession Number: AF062006) 
The cDNA for human HG38 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin
Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, 56°C for 1 min and
72°C for 1 min and 30 sec. Two PCR reactions were performed to separately obtain the 5' and
3' fragment. For the 5' fragment, the 5' PCR primer contained an HindIII site with the sequence:
5'-CCCAAGCTTCGGGCACCATGGACACCTCCC-3' (SEQ.ID.NO.: 259)
and the 3' primer contained a BamHI site with the sequence:
5'-ACAGGATCCAAATGCACAGCACTGGTAAGC-3' (SEQ.ID.NO.: 260).
This 5' 1.5 kb PCR fragment was digested with HindIII and BamHI and cloned into an HindIII-BamHI
site of pCMV. For the 3' fragment, the 5' PCR primer was kinased with the sequence:
5'-CTATAACrGGGTTACATGGnTAAC-3' (SEQ.ID.NO.: 261)
and the 3' primer contained an EcoRI site with the sequence:
5'-TTTGAATTCACATATTAATTAGAGACATGG-3' (SEQ.ID.NO.: 262).
The 1.4 kb 3' PCR fragment was digested with EcoRI and subcloned into a blunt-EcoRI site of
pCMV vector. The 5' and 3' fragments were then ligated together through a common EcoR… site
to generate the full length cDNA clone. Nucleic acid (SEQ.ID.NO.: 263) and amino acid
(SEQ.ID.NO.: 264) sequences for human HG38 were thereafter determined and verified.

34. HM74 (GenBank Accession Number: D10923) 
The cDNA for human HM74 was generated and cloned into pCMV expression 
vector as follows: PCR was performed using either genomic DNA or thymus cDNA (pooled) as
template and rTth polymerase (Perkin Elmer) with the buffer system provided by the
manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle
condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The
5' PCR primer contained an EcoRI site with the sequence:
5'-GGAGAATTCACTAGGCGAGGCGCTCCATC-3' (SEQ.ID.NO.: 105)
and the 3' primer was kinased with the sequence:
5'-GGAGGATCCAGGAAACCTTAGGCCGAGTCC-3' (SEQ.ID.NO.: 106).
The 1.3 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of
pCMV expression vector. Clones sequenced revealed a potential polymorphism involving a
N94K change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 107) and amino acid
(SEQ.ID.NO.: 108) sequences for human HM74 were thereafter determined and verified.

35. MIG (GenBank Accession Numbers: AF044600 and AF044601)
The cDNA for human MIG was generated and cloned into pCMV expression
vector as follows: PCR was performed using genomic DNA as template and TaqPlus Precision
polymerase (Stratagene) for first round PCR or pfu polymerase (Stratagene) for second round PCR
with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
(TaqPlus Precision) or 0.5 mM (pfu) of each of the 4 nucleotides. When pfu was used, 10%
DMSO was included in the buffer. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C
for 1 min; and 72°C for: (a) 1 min for first round PCR; and (b) 2 min for second round PCR.
Because there is an intron in the coding region, two sets of primers were separately used to
generate overlapping 5' and 3' fragments. The 5' fragment PCR primers were:
5'-ACCATGGCTTGCAATGGCAGTGCGGCCAGGGGGCACT-3' (external sense)
(SEQ.ID.NO.: 109)
and
5'-CGACCAGGACAAACAGCATCrrGGTCACTTGTCTCCGGC-3' (internal antisense)
(SEQ.ID.NO.: 110).
The 3' fragment PCR primers were:
5'-GACCAAGATGGTGrrTGTCGTGGTGGtGGTGTTTGGCAT-3' (internal sense)
(SEQ.ID.NO.: 111) and
5'-GGGAATTCAGGATGGATGGGTCTGTrGGTGGGCCr-3' (external antisense with an
EcoRI site) (SEQ.ID.NO.: 112).
The 5' and 3' fragments were ligated together by using the first round PCR as template and the
kinased external sense primer and external antisense primer to perform second round PCR. The
1.2 kb PCR fragment was digested with EcoRI and cloned into the blunt-EcoRI site of pCMV
expression vector. Nucleic acid (SEQ.ID.NO.: 113) and amino acid (SEQ.ID.NO.: 114)
sequences for human MIG were thereafter determined and verified.

36. OGR1 (GenBank Accession Number: U48405)
The cDNA for human OGR1 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:
5'-GGAAGCTTCAGGCCCAAAGATGGGGAACAT-3' (SEQ.ID.NO.: 115)
and the 3' primer contained a BamHI site with the sequence:

5'-GTGGATCCACCCGCGGAGGACCCAGGCTAG-3' (SEQ.ID.NO.: 116).
The 1.1 kb PCR fragment was digested with BamHI and cloned into the EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 117) and amino acid (SEQ.ID.NO.: 118) sequences for human OGR1 were thereafter determined and verified.
37. Serotonin 5HT2A

The cDNA encoding endogenous human 5HT2A receptor was obtained by RT-PCR using human brain poly-A+ RNA; a 5' primer from the 5' untranslated region with an Xho I restriction site:

5'-GACCTCGAGTCCTTCTACACCTCATC-3' (SEQ.ID.NO: 119)

and a 3' primer from the 3' untranslated region containing an Xba I site:

5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO: 120)

PCR was performed using either TaqPlus™ precision polymerase (Stratagene) or rTth™ polymerase (Perkin Elmer) with the buffer system provided by the manufacturers, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 57°C for 1 min; and 72°C for 2 min. The 1.5 kb PCR fragment was digested with Xba I and subcloned into Eco RV-Xba I site of pBluescript. The resulting cDNA clones were fully sequenced and found to encode two amino acid changes from the published sequences. The first one was a T25N mutation in the N-terminal extracellular domain; the second is an H452Y mutation. Because cDNA clones derived from two independent PCR reactions using Taq polymerase from two different commercial sources (TaqPlus™ from Stratagene and rTth™ from Perkin Elmer) contained the same two mutations, these mutations are likely to represent sequence polymorphisms rather than PCR errors. With these exceptions, the nucleic acid (SEQ.ID.NO.: 121) and amino acid (SEQ.ID.NO.: 122) sequences for human 5HT2A were thereafter determined and verified.



38. Serotonin 5HT2C

The cDNA encoding endogenous human 5HT2C receptor was obtained from human brain poly-A+ RNA by RT-PCR. The 5' and 3' primers were derived from the 5' and 3' untranslated regions and contained the following sequences:
5'-GACCTCGAGGTTGCTTAAGACTGAAGC-3' (SEQ.ID.NO.: 123)
5'-ATTTCTAGACATATGTAGCTTGTACCG-3' (SEQ.ID.NO.: 124)
Nucleic acid (SEQ.ID.NO.: 125) and amino acid (SEQ.ID.NO.: 126) sequences for human 5HT2C were thereafter determined and verified.

39. V28 (GenBank Accession Number: U20350) 

The cDNA for human V28 was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site with the sequence:
5'-GGTAAGCTTGGCAGTCCACGCCAGGCCTTC-3' (SEQ.ID.NO.: 127)
and the 3' primer contained an EcoRI site with the sequence:
5'-TCCGAATTCTCTGTAGACACAAGGCTTTGG-3' (SEQ.ID.NO.: 128)
The 1.1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 129) and amino acid (SEQ.ID.NO.: 130) sequences for human V28 were thereafter determined and verified.
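
The cloning steps above rely on restriction sites built into the PCR primers. As a quick consistency check, the following minimal Python sketch searches a few of the primers quoted above for the standard recognition sequences of EcoRI, BamHI and HindIII. The dictionary names and the helper `find_sites` are illustrative only and are not part of the disclosed method.

```python
# Standard recognition sequences for the enzymes named in the cloning steps.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Primers copied from the text, keyed by SEQ.ID.NO.
PRIMERS = {
    105: "GGAGAATTCACTAGGCGAGGCGCTCCATC",   # HM74 5' primer
    106: "GGAGGATCCAGGAAACCTTAGGCCGAGTCC",  # HM74 3' primer
    127: "GGTAAGCTTGGCAGTCCACGCCAGGCCTTC",  # V28 5' primer
    128: "TCCGAATTCTCTGTAGACACAAGGCTTTGG",  # V28 3' primer
}


def find_sites(seq):
    """Return {enzyme: 0-based start position} for each site found in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for seq_id, seq in PRIMERS.items():
    print(seq_id, find_sites(seq))
```

Each of the four primers carries exactly one of the three sites, starting at the fourth base. The check only confirms that a site is present; it says nothing about whether the site was actually used in the cloning step.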



Example 2 

Preparation of Non-Endogenous Human GPCRs 
1. Site-Directed Mutagenesis

Mutagenesis based upon the Human GPCR Proline Marker approach disclosed herein was performed on the foregoing endogenous human GPCRs using Transformer Site-Directed Mutagenesis Kit (Clontech) according to the manufacturer instructions. For this mutagenesis approach, a Mutation Probe and a Selection Marker Probe (unless otherwise indicated, the probe of SEQ.ID.NO.: 132 was the same throughout) were utilized, and the sequences of these for the specified sequences are listed below in Table B (the parenthetical number is the SEQ.ID.NO.). For convenience, the codon mutation incorporated into the human GPCR is also noted, in standard form:



Table B

Receptor Identifier      Mutation Probe Sequence (5'-3')                  Selection Marker Probe Sequence (5'-3')
(Codon Mutation)         (SEQ.ID.NO.)                                     (SEQ.ID.NO.)

GPR1 (F245K)             GATCTCCAGTAGGCATAAGTGGACAATTCTCG (131)           CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (132)
GPR4 (K223A)             AGAAGGCCAAGATCGCGCGGCTGGCCCTCA (133)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR5 (V224K)             CGGCGCCACCGCACGAAAAAGCTCATCTTC (134)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR7 (T250K)             GCCAAGAAGCGGGTGAAGTTCCTGGTGGTGGCA (135)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR9 (M254K)             CGGCGCCTGCGGGCCAAGCGGCTGGTGGTGGTG (137)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR9-6 (L241K)           CCAAGCACAAAGCCAAGAAAGTGACCATCAC (138)            CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR10 (F276K)            GCGCCGGCGCACCAAATGCTTGCTGGTGGT (139)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR15 (I240K)            CAAAAAGCTGAAGAAATCTAAGAAGATCATCTTTATTGTCG (140)  CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR17 (V234K)            CAAGACCAAGGCAAAACGCATGATCGCCAT (141)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR18 (I231K)            GTCAAGGAGAAGTCCAAAAGGATCATCATC (142)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR20 (M240K)            CGCCGCGTGCGGGCCAAGCAGCTCCTGCTC (143)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR21 (A251K)            CCTGATAAGCGCTATAAAATGGTCCTGTTTCGA (144)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR22 (F312K)            GAAAGACAAAAGAGAGTCAAGAGGATGTCnTATTG (145)        CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR24 (T304K)            CGGAGAAAGAGGOTGAAACGCACAGCCATCGCC (146)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR30 (L258K)            alternate approach; see below                    alternate approach; see below
GPR31 (Q221K)            AAGCTTCAGCGGGCCAAGGCACTGCTCACC (147)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR32 (K255A)            CATGCCAACCGGCCCGCGAGGCTGCTGCTGGT (279)           ACCAGCAGCAGCCTCGCGGGCCGGTTGGCATG (280)
GPR40 (A223K)            CGGAAGCTGCGGGCCAAATGGGTGGCCGGC (265)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR41 (A223K)            CAGAGGAGGGTGAAGGGGCTGTTGGCG (266)                CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR43 (V221K)            [illegible]AAGGGGCTGGCTGTGG (267)                CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
APJ (L247K)              alternate approach; see below                    alternate approach; see below
BLR1 (V258K)             CAGCGGCAGAAGGCAAAAAGGGTGGCCATC (148)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
CEPR (L258K)             CGGCAGAAGGCGAAGCGCATGATCCTCGCG (149)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
EBI1 (I262K)             GAGCGCAACAAGGCCAAAAAGGTGATCATC (150)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
EBI2 (L243K)             GGTGTAAACAAAAAGGCIAAAAACACAATTATTCTTATT (151)    CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
ETBR-LP2 (N358K)         GAGAGCCAGCTCAAGAGCACCGTGGTG (152)                CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GHSR (V262K)             CCACAAGCAAACCAAGAAAATGCTGGCTGT (153)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPCR-CNS (N491K)         CTAGAGAGTCAGATGAAGTGTACAGTAGTGGCAC (155)         CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
GPR-NGA (I275K)          CGGACAAAAGTGAAAACTAAAAAGATGTTCCTCATT (156)       CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
H9a and H9b (F236K)      GCTGAGGTTCGCAATAAACTAACCATGTTTGTG (157)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
HB954 (H265K)            GGGAGGCCGAGCTGAAAGCCACCCTGCTC (158)              CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
HG38 (V765K)             GGGACTGCTCTATGAAAAAACACATTGCCCTG (268)           CATCAAGTGTATCATGTGCCAAGTACGOCC (154)
HM74 (I230K)             CAAGATCAAGAGAGCCAAAACCTTCATCATG (159)            CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
MIG (T273K)              CCGGAGACAAGTGAAGAAGATGCTGTTTGTC (160)            CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
OGR1 (Q227K)             GCAAGGACCAGATCAAGCGGCTGGTGCTCA (161)             CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
Serotonin 5HT2A (C322K)  alternate approach; see below                    alternate approach; see below
Serotonin 5HT2C (S310K)  alternate approach; see below                    alternate approach; see below
V28 (I230K)              CAAGAAAGCCAAAGCCAAGAAACTGATCCTTCTG (162)         CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT
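
Each Mutation Probe in Table B carries the codon for the substituted residue. One way to inspect a probe is to translate it in all three forward reading frames with the standard genetic code and look for the lysine (K). The sketch below does this for the CEPR probe (SEQ.ID.NO.: 149). The function name `translate` is illustrative, and which frame corresponds to the receptor's own reading frame is an assumption, since the text does not state it.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in the conventional TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate seq from the given 0-based frame; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


CEPR_PROBE = "CGGCAGAAGGCGAAGCGCATGATCCTCGCG"  # SEQ.ID.NO.: 149, CEPR (L258K)

for frame in range(3):
    print(frame, translate(CEPR_PROBE, frame))
```

Frame 0 reads RQKAKRMILA, while frame 1 runs into a stop codon, which suggests (but does not prove) that frame 0 is the coding frame for this probe.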



The non-endogenous human GPCRs were then sequenced and the derived and verified nucleic acid and amino acid sequences are listed in the accompanying "Sequence Listing" appendix to this patent document, as summarized in Table C below:

Table C

Mutated GPCR             Nucleic Acid Sequence Listing   Amino Acid Sequence Listing

GPR1 (F245K)             SEQ.ID.NO.: 163                 SEQ.ID.NO.: 164
GPR4 (K223A)             SEQ.ID.NO.: 165                 SEQ.ID.NO.: 166
GPR5 (V224K)             SEQ.ID.NO.: 167                 SEQ.ID.NO.: 168
GPR7 (T250K)             SEQ.ID.NO.: 169                 SEQ.ID.NO.: 170
GPR8 (T259K)             SEQ.ID.NO.: 171                 SEQ.ID.NO.: 172
GPR9 (M254K)             SEQ.ID.NO.: 173                 SEQ.ID.NO.: 174
GPR9-6 (L241K)           SEQ.ID.NO.: 175                 SEQ.ID.NO.: 176
GPR10 (F276K)            SEQ.ID.NO.: 177                 SEQ.ID.NO.: 178
GPR15 (I240K)            SEQ.ID.NO.: 179                 SEQ.ID.NO.: 180
GPR17 (V234K)            SEQ.ID.NO.: 181                 SEQ.ID.NO.: 182
GPR18 (I231K)            SEQ.ID.NO.: 183                 SEQ.ID.NO.: 184
GPR20 (M240K)            SEQ.ID.NO.: 185                 SEQ.ID.NO.: 186
GPR21 (A251K)            SEQ.ID.NO.: 187                 SEQ.ID.NO.: 188
GPR22 (F312K)            SEQ.ID.NO.: 189                 SEQ.ID.NO.: 190
GPR24 (T304K)            SEQ.ID.NO.: 191                 SEQ.ID.NO.: 192
GPR30 (L258K)            SEQ.ID.NO.: 193                 SEQ.ID.NO.: 194
GPR31 (Q221K)            SEQ.ID.NO.: 195                 SEQ.ID.NO.: 196
GPR32 (K255A)            SEQ.ID.NO.: 269                 SEQ.ID.NO.: 270
GPR40 (A223K)            SEQ.ID.NO.: 271                 SEQ.ID.NO.: 272
GPR41 (A223K)            SEQ.ID.NO.: 273                 SEQ.ID.NO.: 274
GPR43 (V221K)            SEQ.ID.NO.: 275                 SEQ.ID.NO.: 276
APJ (L247K)              SEQ.ID.NO.: 197                 SEQ.ID.NO.: 198
BLR1 (V258K)             SEQ.ID.NO.: 199                 SEQ.ID.NO.: 200
CEPR (L258K)             SEQ.ID.NO.: 201                 SEQ.ID.NO.: 202
EBI1 (I262K)             SEQ.ID.NO.: 203                 SEQ.ID.NO.: 204
EBI2 (L243K)             SEQ.ID.NO.: 205                 SEQ.ID.NO.: 206
ETBR-LP2 (N358K)         SEQ.ID.NO.: 207                 SEQ.ID.NO.: 208
GHSR (V262K)             SEQ.ID.NO.: 209                 SEQ.ID.NO.: 210
GPCR-CNS (N491K)         SEQ.ID.NO.: 211                 SEQ.ID.NO.: 212
GPR-NGA (I275K)          SEQ.ID.NO.: 213                 SEQ.ID.NO.: 214
H9a (F236K)              SEQ.ID.NO.: 215                 SEQ.ID.NO.: 216
H9b (F236K)              SEQ.ID.NO.: 217                 SEQ.ID.NO.: 218
HB954 (H265K)            SEQ.ID.NO.: 219                 SEQ.ID.NO.: 220
HG38 (V765K)             SEQ.ID.NO.: 277                 SEQ.ID.NO.: 278
HM74 (I230K)             SEQ.ID.NO.: 221                 SEQ.ID.NO.: 222
MIG (T273K)              SEQ.ID.NO.: 223                 SEQ.ID.NO.: 224
OGR1 (Q227K)             SEQ.ID.NO.: 225                 SEQ.ID.NO.: 226
Serotonin 5HT2A (C322K)  SEQ.ID.NO.: 227                 SEQ.ID.NO.: 228
Serotonin 5HT2C (S310K)  SEQ.ID.NO.: 229                 SEQ.ID.NO.: 230
V28 (I230K)              SEQ.ID.NO.: 231                 SEQ.ID.NO.: 232

2. Alternate Mutation Approaches for Employment of the Proline Marker Algorithm: APJ; Serotonin 5HT2A; Serotonin 5HT2C; and GPR30

Although the above site-directed mutagenesis approach is particularly preferred, other approaches can be utilized to create such mutations; those skilled in the art are readily credited with selecting approaches to mutating a GPCR that fit within the particular needs of the artisan.

a. APJ

Preparation of the non-endogenous, human APJ receptor was accomplished by mutating L247K. Two oligonucleotides containing this mutation were synthesized:
5'-GGCTTAAGAGCATCATCGTGGTGCTGGTG-3' (SEQ.ID.NO.: 233)
5'-GTCACCACCAGCACCACGATGATGCTCTTAAGCC-3' (SEQ.ID.NO.: 234)

The two oligonucleotides were annealed and used to replace the NaeI-BstEII fragment of human, endogenous APJ to generate the non-endogenous version of human APJ.

b. Serotonin 5HT2A
cDNA containing the point mutation C322K was constructed by utilizing the restriction enzyme site Sph I which encompasses amino acid 322. A primer containing the C322K mutation:
5'-CAAAGAAAGTACTGGGCATCGTCTTCTTCCT-3' (SEQ.ID.NO: 235)
was used along with the primer from the 3' untranslated region of the receptor
5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO.: 236)
to perform PCR (under the conditions described above). The resulting PCR fragment was then used to replace the 3' end of endogenous 5HT2A cDNA through the T4 polymerase blunted Sph I site.

c. Serotonin 5HT2C

The cDNA containing a S310K mutation was constructed by replacing the Sty I restriction fragment containing amino acid 310 with synthetic double stranded oligonucleotides that encode the desired mutation. The sense strand sequence utilized had the following sequence:
5'-CTAGGGGCACCATGCAGGCTATCAACAATGAAAGAAAAGCTAAGAAAGTC-3' (SEQ.ID.NO.: 237)

and the antisense strand sequence utilized had the following sequence:
5'-CAAGGACTTTCTTAGCTT^ (SEQ.ID.NO.: 238)

d. GPR30 

Prior to generating non-endogenous GPR30, several independent pCR2.1/GPR30 isolates were sequenced in their entirety in order to identify clones with no PCR-generated mutations. A clone having no mutations was digested with EcoRI and the endogenous GPR30 cDNA fragment was transferred into the CMV-driven expression plasmid pCI-neo (Promega), by digesting pCI-Neo with EcoRI and subcloning the EcoRI-liberated GPR30 fragment from pCR2.1/GPR30, to generate pCI/GPR30. Thereafter, the leucine at codon 258 was mutated to a lysine using a Quick-Change™ Site-Directed Mutagenesis Kit (Stratagene, #200518), according to manufacturer's instructions, and the following primers:

5'-CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT-3' (SEQ.ID.NO.: 239) and
5'-ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG-3' (SEQ.ID.NO.: 240)
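
The paired oligonucleotides in this section are meant to anneal to each other, so one strand should match the reverse complement of the other. The following Python sketch checks this for the GPR30 primers and the APJ oligonucleotides as quoted above (with the stray spaces inside the printed sequences removed). The helper `revcomp` and the variable names are illustrative and not part of the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


GPR30_SENSE = "CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT"      # SEQ.ID.NO.: 239
GPR30_ANTISENSE = "ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG"  # SEQ.ID.NO.: 240
APJ_TOP = "GGCTTAAGAGCATCATCGTGGTGCTGGTG"                # SEQ.ID.NO.: 233
APJ_BOTTOM = "GTCACCACCAGCACCACGATGATGCTCTTAAGCC"        # SEQ.ID.NO.: 234

# The two GPR30 primers are exact reverse complements of each other.
print(revcomp(GPR30_SENSE) == GPR30_ANTISENSE)

# The APJ bottom strand is the reverse complement of the top strand
# plus five extra 5' bases that remain single-stranded after annealing.
print(APJ_BOTTOM.endswith(revcomp(APJ_TOP)), APJ_BOTTOM[:5])
```

The five unpaired 5' bases on the APJ bottom strand (GTCAC) would be consistent with a cohesive end for the BstEII side of the NaeI-BstEII replacement, with the NaeI side left blunt; that reading is an inference from the sequences and is not stated in the text.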
Example 3 

Receptor (Endogenous and Mutated) Expression 

Although a variety of cells are available to the art for the expression of proteins, it is most preferred that mammalian cells be utilized. The primary reason for this is predicated upon practicalities, i.e., utilization of, e.g., yeast cells for the expression of a GPCR, while possible, introduces into the protocol a non-mammalian cell which may not (indeed, in the case of yeast, does not) include the receptor-coupling, genetic-mechanism and secretory pathways that have evolved for mammalian systems; thus, results obtained in non-mammalian cells, while of potential use, are not as preferred as those obtained from mammalian cells. Of the mammalian cells, COS-7, 293 and 293T cells are particularly preferred, although the specific mammalian cell utilized can be predicated upon the particular needs of the artisan.

Unless otherwise noted herein, the following protocol was utilized for the expression of the endogenous and non-endogenous human GPCRs. Table D lists the mammalian cell and number utilized (per 150mm plate) for GPCR expression.

Table D

Receptor Name                      Mammalian Cell
(Endogenous or Non-Endogenous)     (Number Utilized)

GPR17                              293 (2 x 10^[illegible])
GPR30                              293 (4 x 10^[illegible])
APJ                                COS-7 (5 x 10^[illegible])
ETBR-LP2                           293 (1 x 10^[illegible]); 293T (1 x 10^[illegible])
GHSR                               293 (1 x 10^[illegible]); 293T (1 x 10^[illegible])
MIG                                293 (1 x 10^[illegible])
Serotonin 5HT2A                    293T (1 x 10^[illegible])
Serotonin 5HT2C                    293T (1 x 10^[illegible])


On day one, mammalian cells were plated out. On day two, two reaction tubes were prepared (the proportions to follow for each tube are per plate): tube A was prepared by mixing 20µg DNA (e.g., pCMV vector, pCMV vector with endogenous receptor cDNA, and pCMV vector with non-endogenous receptor cDNA) in 1.2ml serum free DMEM (Irvine Scientific, Irvine, CA); tube B was prepared by mixing 120µl lipofectamine (Gibco BRL) in 1.2ml serum free DMEM. Tubes A and B were then admixed by inversions (several times), followed by incubation at room temperature for 30-45min. The admixture is referred to as the "transfection mixture". Plated cells were washed with 1X PBS, followed by addition of 10ml serum free DMEM. 2.4ml of the transfection mixture was then added to the cells, followed by incubation for 4hrs at 37°C/5% CO2. The transfection mixture was then removed by aspiration, followed by the addition of 25ml of DMEM/10% Fetal Bovine Serum. Cells were then incubated at 37°C/5% CO2. After 72hr incubation, cells were then harvested and utilized for analysis.
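
Because the proportions above are stated per plate, preparing several plates is a matter of multiplying each component. A minimal sketch of that arithmetic follows, reading the DNA amount as 20 µg and the lipofectamine volume as 120 µl. The `overage` parameter (extra volume to cover pipetting loss) is an illustrative addition and does not come from the protocol.

```python
# Per-plate amounts taken from the protocol above.
PER_PLATE = {
    "dna_ug": 20.0,            # tube A: plasmid DNA
    "tube_a_dmem_ml": 1.2,     # tube A: serum free DMEM
    "lipofectamine_ul": 120.0, # tube B: lipofectamine
    "tube_b_dmem_ml": 1.2,     # tube B: serum free DMEM
}


def scale(n_plates, overage=0.0):
    """Scale the per-plate recipe to n_plates, with optional fractional overage."""
    factor = n_plates * (1.0 + overage)
    return {name: round(amount * factor, 3) for name, amount in PER_PLATE.items()}


print(scale(3))
print(scale(3, overage=0.1))
```

For three plates with no overage this gives 60 µg DNA and 3.6 ml DMEM in tube A, and 360 µl lipofectamine and 3.6 ml DMEM in tube B.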
1. Gi-Coupled Receptors: Co-Transfection with Gs-Coupled Receptors
In the case of GPR30, it has been determined that this receptor couples to the G protein Gi. Gi is known to inhibit the enzyme adenylyl cyclase, which is necessary for catalyzing the conversion of ATP to cAMP. Thus, a non-endogenous, constitutively activated form of GPR30 would be expected to be associated with decreased levels of cAMP. Assay confirmation of a non-endogenous, constitutively activated form of GPR30 directly via measurement of decreasing levels of cAMP, while viable, can be preferably measured by cooperative use of a Gs-coupled receptor. For example, a receptor that is Gs-coupled will stimulate adenylyl cyclase, and thus will be associated with an increase in cAMP. The assignee of the present application has discovered that the orphan receptor GPR6 is an endogenous, constitutively activated GPCR. GPR6 couples to the Gs protein. Thus when co-transfected, one can readily verify that a putative GPR30-mutation leads to constitutive activation thereof: i.e., an endogenous, constitutively activated GPR6/endogenous, non-constitutively activated GPR30 cell will evidence an elevated level of cAMP when compared with an endogenous, constitutively active GPR6/non-endogenous, constitutively activated GPR30 (the latter evidencing a comparatively lower level of cAMP).

Assays that detect cAMP can be utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gs-associated receptor (i.e., such a compound would decrease the levels of cAMP) or a Gi-associated receptor (or a Go-associated receptor) (i.e., such a candidate compound would increase the levels of cAMP). A variety of approaches known in the art for measuring cAMP can be utilized; a preferred approach relies upon the use of anti-cAMP antibodies. Another approach, and most preferred, utilizes a whole cell second messenger reporter system assay. Promoters on genes drive the expression of the proteins that a particular gene encodes. Cyclic AMP drives gene expression by promoting the binding of a cAMP-responsive DNA binding protein or transcription factor (CREB) which then binds to the promoter at specific sites called cAMP response elements and drives the expression of the gene. Reporter systems can be constructed which have a promoter containing multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or luciferase. Thus, an activated receptor such as GPR6 causes the accumulation of cAMP which then activates the gene and expression of the reporter protein. Most preferably, 293 cells are co-transfected with GPR6 (or another Gs-linked receptor) and GPR30 (or another Gi-linked receptor) plasmids, preferably in a 1:1 ratio, most preferably in a 1:4 ratio. Because GPR6 is an endogenous, constitutively active receptor that stimulates the production of cAMP, GPR6 activates the reporter gene and its expression. The reporter protein such as β-galactosidase or luciferase can then be detected using standard biochemical assays (Chen et al. 1995). Co-transfection of endogenous, constitutively active GPR6 with endogenous, non-constitutively active GPR30 evidences an increase in the luciferase reporter protein. Conversely, co-transfection of endogenous, constitutively active GPR6 with non-endogenous, constitutively active GPR30 evidences a drastic decrease in expression of luciferase. Several reporter plasmids are known and available in the art for measuring a second messenger assay. It is considered well within the skilled artisan to determine an appropriate reporter plasmid for a particular gene expression based primarily upon the particular need of the artisan. Although a variety of cells are available for expression, mammalian cells are most preferred, and of these types, 293 cells are most preferred. 293 cells were transfected with the reporter plasmid pCRE-Luc/GPR6 and non-endogenous, constitutively activated GPR30 using a Mammalian Transfection™ Kit (Stratagene, #200285) CaPO4 precipitation protocol according to the manufacturer's instructions (see, 28 Genomics 347 (1995) for the published endogenous GPR6 sequence). The precipitate contained 400ng reporter, 80ng CMV-expression plasmid (having a 1:4 GPR6 to endogenous GPR30 or non-endogenous GPR30 ratio) and 20ng CMV-SEAP (a transfection control plasmid encoding secreted alkaline phosphatase). 50% of the precipitate was split into 3 wells of a 96-well tissue culture dish (containing 4 x 10^[illegible] cells/well); the remaining 50% was discarded. The following morning, the media was changed. 48 hr after the start of the transfection, cells were lysed and examined for luciferase activity using a LucLite™ Kit (Packard, Cat. # 6016911) and Trilux 1450 Microbeta™ liquid scintillation and luminescence counter (Wallac) as per the vendor's instructions. The data were analyzed using GraphPad Prism 2.0a (GraphPad Software Inc.).
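
The composition of the precipitate can be worked through numerically. The sketch below splits the 80ng of CMV-expression plasmid at the stated 1:4 ratio and estimates the total DNA reaching each of the 3 wells. It assumes the 1:4 ratio is by mass and that the precipitate is divided evenly; neither point is stated explicitly in the text, and the function name is illustrative.

```python
def split_by_ratio(total_ng, ratio):
    """Split a total mass according to an integer ratio such as (1, 4)."""
    parts = sum(ratio)
    return [total_ng * r / parts for r in ratio]


REPORTER_NG = 400
EXPRESSION_NG = 80
SEAP_NG = 20

# 1:4 GPR6 to GPR30 within the 80 ng of CMV-expression plasmid.
gpr6_ng, gpr30_ng = split_by_ratio(EXPRESSION_NG, (1, 4))

# Half of the precipitate is used and divided over 3 wells.
total_ng = REPORTER_NG + EXPRESSION_NG + SEAP_NG
dna_per_well_ng = total_ng * 0.5 / 3

print(gpr6_ng, gpr30_ng, round(dna_per_well_ng, 1))
```

Under these assumptions the expression plasmid divides into 16 ng of GPR6 and 64 ng of GPR30, and each well receives roughly 83 ng of total DNA.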

With respect to GPR17, which has also been determined to be Gi-linked, a modification of the foregoing approach was utilized, based upon, inter alia, use of another Gs-linked endogenous receptor, GPR3 (see 23 Genomics 609 (1994) and 24 Genomics 391 (1994)). Most preferably, 293 cells are utilized. These cells were plated-out on 96 well plates at a density of 2 x 10^[illegible] cells per well and were transfected using Lipofectamine Reagent (BRL) the following day according to manufacturer instructions. A DNA/lipid mixture was prepared for each 6-well transfection as follows: 260ng of plasmid DNA in 100µl of DMEM were gently mixed with 2µl of lipid in 100µl of DMEM (the 260ng of plasmid DNA consisted of 200ng of a 8xCRE-Luc reporter plasmid (see below), 50ng of pCMV comprising endogenous receptor or non-endogenous receptor or pCMV alone, and 10ng of a GPR3 expression plasmid (GPR3 in pcDNA3 (Invitrogen))). The 8xCRE-Luc reporter plasmid was prepared as follows: vector SRIF-β-gal was obtained by cloning the rat somatostatin promoter (-71/+51) at BglII-HindIII site in the pβgal-Basic Vector (Clontech). Eight (8) copies of cAMP response element were obtained by PCR from an adenovirus template AdpCF126CCRE8 (see 7 Human Gene Therapy 1883 (1996)) and cloned into the SRIF-β-gal vector at the Kpn-BglII site, resulting in the 8xCRE-β-gal reporter vector. The 8xCRE-Luc reporter plasmid was generated by replacing the beta-galactosidase gene in the 8xCRE-β-gal reporter vector with the luciferase gene obtained from the pGL3-basic vector (Promega) at the HindIII-BamHI site. Following 30min. incubation at room temperature, the DNA/lipid mixture was diluted with 400µl of DMEM and 100µl of the diluted mixture was added to each well. 100µl of DMEM with 10% FCS were added to each well after a 4hr incubation in a cell culture incubator. The next morning the transfected cells were changed with 200µl/well of DMEM with 10% FCS. Eight (8) hours later, the wells were changed to 100µl/well of DMEM without phenol red, after one wash with PBS. Luciferase activity was measured the next day using the LucLite™ reporter gene assay kit (Packard) following manufacturer instructions and read on a 1450 MicroBeta™ scintillation and luminescence counter (Wallac).
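
The volumes in this protocol explain why one DNA/lipid mixture serves six wells. The following sketch works through the arithmetic, treating the mixture as 100 µl + 100 µl + 400 µl and ignoring the 2 µl of lipid. The component labels are illustrative; the Gs-linked plasmid is labelled GPR3 following the opening sentence of the paragraph.

```python
# Plasmid amounts (ng) in one DNA/lipid mixture, as stated in the protocol.
COMPONENTS_NG = {
    "8xCRE-Luc reporter": 200,
    "receptor plasmid or empty pCMV": 50,
    "GPR3 expression plasmid": 10,
}

# DNA in 100 ul DMEM + lipid in 100 ul DMEM, then diluted with 400 ul DMEM.
mix_ul = 100 + 100 + 400
per_well_ul = 100

wells = mix_ul // per_well_ul
per_well_ng = {
    name: amount * per_well_ul / mix_ul for name, amount in COMPONENTS_NG.items()
}

print(wells, sum(COMPONENTS_NG.values()))
for name, amount in per_well_ng.items():
    print(f"{name}: {amount:.1f} ng/well")
```

The components sum to the stated 260 ng, the diluted mixture covers 6 wells at 100 µl each, and each well receives about 43 ng of DNA in total, of which about 33 ng is reporter.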

Figure 4 evidences that constitutively active GPR30 inhibits GPR6-mediated activation of CRE-Luc reporter in 293 cells. Luciferase was measured at about 4.1 relative light units in the expression vector pCMV. Endogenous GPR30 expressed luciferase at about 8.5 relative light units, whereas the non-endogenous, constitutively active GPR30 (L258K), expressed luciferase at about 3.8 and 3.1 relative light units, respectively. Co-transfection of endogenous GPR6 with endogenous GPR30, at a 1:4 ratio, drastically increased luciferase expression to about 104.1 relative light units. Co-transfection of endogenous GPR6 with non-endogenous GPR30 (L258K), at the same ratio, drastically decreased the expression, which is evident at about 18.2 and 29.5 relative light units, respectively. Similar results were observed with respect to GPR17 with respect to co-transfection with GPR3, as set forth in Figure 5.
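
The size of the decrease reported for Figure 4 can be expressed as a percentage of the GPR6/endogenous GPR30 signal. The short sketch below uses only the relative light unit values quoted above; the function name is illustrative.

```python
def percent_decrease(reference, value):
    """Percentage by which value falls below reference."""
    return 100.0 * (reference - value) / reference


# Relative light units quoted in the text for the GPR6 co-transfections.
GPR6_WITH_ENDOGENOUS_GPR30 = 104.1
GPR6_WITH_L258K_GPR30 = (18.2, 29.5)

for rlu in GPR6_WITH_L258K_GPR30:
    print(rlu, round(percent_decrease(GPR6_WITH_ENDOGENOUS_GPR30, rlu), 1))
```

On these figures the two non-endogenous GPR30 (L258K) measurements sit roughly 82.5% and 71.7% below the endogenous GPR30 co-transfection.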
Example 3 

Assays For Determination of Constitutive Activity
of Non-Endogenous GPCRs

A. Membrane Binding Assays

1. [35S]GTPγS Assay

When a G protein-coupled receptor is in its active state, either as a result of ligand binding or constitutive activation, the receptor couples to a G protein and stimulates the release of GDP and subsequent binding of GTP to the G protein. The alpha subunit of the G protein-receptor complex acts as a GTPase and slowly hydrolyzes the GTP to GDP, at which point the receptor normally is deactivated. Constitutively activated receptors continue to exchange GDP for GTP. The non-hydrolyzable GTP analog, [35S]GTPγS, can be utilized to demonstrate enhanced binding of [35S]GTPγS to membranes expressing constitutively activated receptors. The advantage of using [35S]GTPγS binding to measure constitutive activation is that: (a) it is generically applicable to all G protein-coupled receptors; (b) it is proximal at the membrane surface making it less likely to pick-up molecules which affect the intracellular cascade.

The assay utilizes the ability of G protein coupled receptors to stimulate [35S]GTPγS binding to membranes expressing the relevant receptors. The assay can, therefore, be used in the direct identification method to screen candidate compounds to known, orphan and constitutively activated G protein-coupled receptors. The assay is generic and has application to drug discovery at all G protein-coupled receptors.

The [35S]GTPγS assay was incubated in 20mM HEPES and between 1 and about 20mM MgCl2 (this amount can be adjusted for optimization of results, although 20mM is preferred) pH 7.4, binding buffer with between about 0.3 and about 1.2 nM [35S]GTPγS (this amount can be adjusted for optimization of results, although 1.2 is preferred) and 12.5 to 75 µg membrane protein (e.g., COS-7 cells expressing the receptor; this amount can be adjusted for optimization, although [illegible] is preferred) and 1 nM GDP (this amount can be changed for optimization) for 1 hour. Wheatgerm agglutinin beads (25 µl; Amersham) were then added and the mixture was incubated for another 30 minutes at room temperature. The tubes were then centrifuged at 1500 x g for 5 minutes at room temperature and then counted in a scintillation counter.

A less costly but equally applicable alternative has been identified which also meets the needs of large scale screening. Flash plates™ and Wallac™ scintistrips may be utilized to format a high throughput [35S]GTPγS binding assay. Furthermore, using this technique, the assay can be utilized for known GPCRs to simultaneously monitor tritiated ligand binding to the receptor at the same time as monitoring the efficacy via [35S]GTPγS binding. This is possible because the Wallac beta counter can switch energy windows to look at both tritium and 35S-labeled probes. This assay may also be used to detect other types of membrane activation events resulting in receptor activation. For example, the assay may be used to monitor 32P phosphorylation of a variety of receptors (both G protein coupled and tyrosine kinase receptors). When the membranes are centrifuged to the bottom of the well, the bound [35S]GTPγS or the 32P-phosphorylated receptor will activate the scintillant which is coated on the wells. Scinti strips (Wallac) have been used to demonstrate this principle. In addition, the assay also has utility for measuring ligand binding to receptors using radioactively labeled ligands. In a similar manner, when the radiolabeled bound ligand is centrifuged to the bottom of the well, the scintistrip label comes into proximity with the radiolabeled ligand resulting in activation and detection.

Representative results of a graph comparing Control (pCMV), Endogenous APJ and Non-Endogenous APJ, based upon the foregoing protocol, are set forth in Figure 6.

2. Adenylyl Cyclase

A Flash Plate™ Adenylyl Cyclase kit (New England Nuclear, Cat. No. SMP004A) designed for cell-based assays was modified for use with crude plasma membranes. The Flash Plate wells contain a scintillant coating which also contains a specific antibody recognizing cAMP. The cAMP generated in the wells was quantitated by a direct competition for binding of radioactive cAMP tracer to the cAMP antibody. The following serves as a brief protocol for the measurement of changes in cAMP levels in membranes that express the receptors.

Transfected cells were harvested approximately three days after transfection. Membranes were prepared by homogenization of suspended cells in buffer containing 20mM HEPES, pH 7.4 and 10mM MgCl2. Homogenization was performed on ice using a Brinkman Polytron™ for approximately 10 seconds. The resulting homogenate was centrifuged at 49,000 X g for 15 minutes at 4°C. The resulting pellet was then resuspended in buffer containing 20mM HEPES, pH 7.4 and 0.1 mM EDTA, homogenized for 10 seconds, followed by centrifugation at 49,000 X g for 15 minutes at 4°C. The resulting pellet can be stored at -80°C until utilized. On the day of measurement, the membrane pellet was slowly thawed at room temperature, resuspended in buffer containing 20mM HEPES, pH 7.4 and 10mM MgCl2 (these amounts can be optimized, although the values listed herein are preferred), to yield a final protein concentration of 0.60mg/ml (the resuspended membranes were placed on ice until use).

cAMP standards and Detection Buffer (comprising 2 nCi of tracer [125I cAMP (100 µl)] to
11 ml Detection Buffer) were prepared and maintained in accordance with the manufacturer's
instructions. Assay Buffer was prepared fresh for screening and contained 20mM [illegible],
10mM MgCl2, 20mM [illegible] (Sigma), 0.1 units/ml creatine phosphokinase (Sigma), 50 nM GTP
(Sigma), and 0.2 mM ATP (Sigma); Assay Buffer can be stored on ice until utilized. The assay
was initiated by addition of 50ul of assay buffer followed by addition of 50ul of membrane
suspension to the NEN Flash Plate. The resultant assay mixture is incubated for 60 minutes at
room temperature followed by addition of 100ul of detection buffer. Plates are then incubated an
additional 2-4 hours followed by counting in a Wallac MicroBeta scintillation counter. Values of
cAMP/well are extrapolated from a standard cAMP curve which is contained within each assay
plate. The foregoing assay was utilized with respect to analysis of MIG.
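
The read-out step, converting counts per well into cAMP per well from the standard curve on the same plate, can be sketched as follows. This is a minimal illustration and not the manufacturer's procedure: the function name `camp_from_counts`, the log-linear interpolation between neighbouring standards, and the standard values in the usage note are all assumptions. Because the assay is a competition assay, counts fall as cAMP rises; the sketch interpolates only within the range of the standards and refuses to extrapolate beyond it.

```python
import math


def camp_from_counts(counts, standards):
    """Estimate cAMP per well from scintillation counts.

    standards: list of (pmol_cAMP, counts) pairs from the same plate
    (hypothetical values; each standard must have a distinct count).
    Interpolates log10(cAMP) linearly between neighbouring standards.
    """
    # Sort the curve by counts so neighbouring points bracket the sample.
    pts = sorted((c, math.log10(p)) for p, c in standards)
    if not pts[0][0] <= counts <= pts[-1][0]:
        raise ValueError("counts fall outside the standard curve")
    for (c_lo, lp_lo), (c_hi, lp_hi) in zip(pts, pts[1:]):
        if c_lo <= counts <= c_hi:
            frac = (counts - c_lo) / (c_hi - c_lo)
            return 10 ** (lp_lo + frac * (lp_hi - lp_lo))
```

For example, with illustrative standards `[(1, 9000), (10, 5000), (100, 1500)]`, a well reading 5000 counts maps to the 10 pmol standard, and a well reading above 9000 counts is rejected rather than guessed.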
B. Reporter-Based Assays 

1. CREB Reporter Assay (Gs-associated receptors)

A method to detect Gs stimulation depends on the known property of the transcription
factor CREB, which is activated in a cAMP-dependent manner. A PathDetect CREB trans-Reporting
System (Stratagene, Catalogue # 219010) was utilized to assay for Gs coupled
activity in 293 or 293T cells. Cells were transfected with the plasmid components of the
above system and the indicated expression plasmid encoding endogenous or mutant receptor
using a Mammalian Transfection Kit (Stratagene, Catalogue #200285) according to the
manufacturer's instructions. Briefly, 400 ng pFR-Luc (luciferase reporter plasmid containing
Gal4 recognition sequences), 40 ng pFA2-CREB (Gal4-CREB fusion protein containing the
Gal4 DNA-binding domain), 80 ng CMV-receptor expression plasmid (comprising the
receptor) and 20 ng CMV-SEAP (secreted alkaline phosphatase expression plasmid; alkaline
phosphatase activity is measured in the media of transfected cells to control for variations in
transfection efficiency between samples) were combined in a calcium phosphate precipitate
as per the Kit's instructions. Half of the precipitate was equally distributed over 3 wells in a
96-well plate, kept on the cells overnight, and replaced with fresh medium the following
morning. Forty-eight (48) hr after the start of the transfection, cells were treated and assayed
for luciferase activity as set forth with respect to the GPR30 system, above. This assay was
used with respect to GHSR.
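
The role of the CMV-SEAP plasmid as a transfection-efficiency control can be illustrated with a short sketch. The text states only that alkaline phosphatase activity is measured to control for variation between samples; dividing each well's luciferase signal by its SEAP signal and averaging the triplicate wells is an assumed, commonly used way of applying such a control, and the function names and numbers below are hypothetical.

```python
def normalized_reporter(luciferase, seap):
    """Luciferase signal corrected for transfection efficiency (assumed ratio method)."""
    if seap <= 0:
        raise ValueError("SEAP signal must be positive")
    return luciferase / seap


def mean_normalized(wells):
    """Average the normalized signal over replicate wells.

    wells: list of (luciferase, seap) readings, e.g. the 3 wells per precipitate.
    """
    values = [normalized_reporter(luc, seap) for luc, seap in wells]
    return sum(values) / len(values)
```

With illustrative readings `[(100.0, 2.0), (150.0, 3.0), (200.0, 4.0)]`, the three wells differ in raw luciferase signal but share the same normalized value, which is the behaviour the control is meant to provide.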

2. AP1 reporter assay (Gq-associated receptors)

A method to detect Gq stimulation depends on the known property of Gq-dependent
phospholipase C to cause the activation of genes containing AP1 elements in their promoter.
A Pathdetect AP-1 cis-Reporting System (Stratagene, Catalogue # 219073) was utilized
following the protocol set forth above with respect to the CREB reporter assay, except that the
components of the calcium phosphate precipitate were 410 ng pAP1-Luc, 80 ng receptor
expression plasmid, and 20 ng CMV-SEAP. This assay was used with respect to ETBR-LP2.

C. Intracellular IP3 Accumulation Assay 
On day 1, cells comprising the serotonin receptors (endogenous and mutated) were
plated onto 24 well plates, usually 1x10^ cells/well. On day 2 cells were transfected by firstly
mixing 0.25 µg DNA in 50 µl serum free DMEM/well and 2 µl lipofectamine in 50 µl
serum free DMEM/well. The solutions were gently mixed and incubated for 15-30 min at
room temperature. Cells were washed with 0.5 ml PBS and 400 µl of serum free media was
mixed with the transfection media and added to the cells. The cells were then incubated for
3-4 hrs at 37°C/5% CO2 and then the transfection media was removed and replaced with
1 ml/well of regular growth media. On day 3 the cells were labeled with 3H-myo-inositol.
Briefly, the media was removed and the cells were washed with 0.5 ml PBS. Then 0.5 ml
inositol-free/serum free media (GIBCO BRL) was added/well with 0.25 µCi of 3H-myo-inositol/well
and the cells were incubated for 16-18 hrs o/n at 37°C/5% CO2. On Day 4 the cells were
washed with 0.5 ml PBS and 0.45 ml of assay medium was added containing inositol-free/serum
free media, 10 µM pargyline, 10 mM lithium chloride, or 0.4 ml of assay medium
and 50 µl of 10x ketanserin (ket) to final concentration of 10 µM. The cells were then
incubated for 30 min at 37°C. The cells were then washed with 0.5 ml PBS and 200 µl of
fresh/ice cold stop solution (1 M KOH; 18 mM Na-borate; 3.8 mM EDTA) was added/well.
The solution was kept on ice for 5-10 min or until cells were lysed and then neutralized by
200 µl of fresh/ice cold neutralization sol. (7.5% HCl). The lysate was then transferred into
1.5 ml eppendorf tubes and 1 ml of chloroform/methanol (1:2) was added/tube. The solution
was vortexed for 15 sec and the upper phase was applied to a Biorad AG1-X8 anion
exchange resin (100-200 mesh). Firstly, the resin was washed with water at 1:1.25 W/V and
0.9 ml of upper phase was loaded onto the column. The column was washed with 10 mls of
5 mM myo-inositol and 10 ml of 5 mM Na-borate/60mM Na-formate. The inositol tris
phosphates were eluted into scintillation vials containing 10 ml of scintillation cocktail with
2 ml of 0.1 M formic acid/1 M ammonium formate. The columns were regenerated by
washing with 10 ml of 0.1 M formic acid/3M ammonium formate and rinsed twice with dd
H2O and stored at 4°C in water.

Figure 7 provides an illustration of IP3 production from the human 5-HT2A receptor
that incorporates the C322K mutation. While these results evidence that the Proline Mutation
Algorithm approach constitutively activates this receptor, for purposes of using such a
receptor for screening for identification of potential therapeutics, a more robust difference
would be preferred. However, because the activated receptor can be utilized for understanding
and elucidating the role of constitutive activation and for the identification of compounds that
can be further examined, we believe that this difference is itself useful in differentiating
between the endogenous and non-endogenous versions of the human 5HT2A receptor.
D. Result Summary 

The results for the GPCRs tested are set forth in Table E where the Per-Cent Difference
indicates the percentage difference in results observed for the non-endogenous GPCR as compared
to the endogenous GPCR; these values are followed by parenthetical indications as to the type of
assay utilized. Additionally, the assay system utilized is parenthetically listed (and, in cases where
different Host Cells were used, both are listed). As these results indicate, a variety of assays can
be utilized to determine constitutive activity of the non-endogenous versions of the human GPCRs.
Those skilled in the art, based upon the foregoing and with reference to information available to
the art, are credited with the ability to select and/or maximize a particular assay approach that
suits the particular needs of the investigator.
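
The text does not give the formula behind the percentage values in Table E. One plausible reading, offered here only as an assumption, is the change in assay signal of the non-endogenous receptor relative to the endogenous receptor. The function name and the signal values in the usage note are hypothetical.

```python
def percent_difference(endogenous_signal, non_endogenous_signal):
    """Percentage change of the mutant signal relative to the endogenous signal.

    Assumed definition: 100 * (non-endogenous - endogenous) / endogenous.
    """
    if endogenous_signal <= 0:
        raise ValueError("endogenous signal must be positive")
    return 100.0 * (non_endogenous_signal - endogenous_signal) / endogenous_signal
```

Under this reading, an endogenous signal of 100.0 units and a non-endogenous signal of 174.5 units in the same assay would be reported as a difference of about 74.5 per cent.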



Table E

Receptor Identifier          Per-Cent Difference
(Codon Mutation)

GPR17 (V234K)                74.5 (CRE-Luc)
GPR30 (L258K)                71.6 (CREB)
APJ (L247K)                  49.0 (GTPγS)
ETBR-LP2 (N358K)             48.4 (AP1-Luc - 293)
                             61.1 (AP1-Luc - 293T)
GHSR (V262K)                 58.9 (CREB - 293)
                             35.6 (CREB - 293T)
MIG (I230K)                  39 (cAMP)
Serotonin 5HT2A (C322K)      33.2 (IP3)
Serotonin 5HT2C (S310K)      39.1 (IP3)



Example 6 

Tissue Distribution of Endogenous Orphan GPCRs 

Using a commercially available human-tissue dot-blot format, endogenous orphan GPCRs
were probed for a determination of the areas where such receptors are localized. Except as indicated
below, the entire receptor cDNA (radiolabelled) was used as the probe: radiolabeled probe was
generated using the complete receptor cDNA (excised from the vector) using a Prime-It II™
Random Primer Labeling Kit (Stratagene, #300385), according to manufacturer's instructions.
A human RNA Master Blot™ (Clontech, #7770-1) was hybridized with the GPCR
radiolabeled probe and washed under stringent conditions according to manufacturer's
instructions. The blot was exposed to Kodak BioMax Autoradiography film overnight at
-80°C.

Representative dot-blot format results are presented in Figure 8 for GPR1 (8A), GPR30
(8B), and APJ (8C), with results being summarized for all receptors in Table F.



Table F

GPCR           Tissue Distribution
               (highest levels, relative to other tissues in the dot-blot)

GPR1           Placenta, Ovary, Adrenal
GPR4           Broad; highest in Heart, Lung, Adrenal, Thyroid, Spinal Cord
GPR5           Placenta, Thymus, Fetal Thymus
               Lesser levels in Spleen, Fetal Spleen
GPR7           Liver, Spleen, Spinal Cord, Placenta
GPR8           No expression detected
GPR9-6         Thymus, Fetal Thymus
               Lesser levels in Small Intestine
GPR18          Spleen, Lymph Node, Fetal Spleen, Testis
GPR20          Broad
GPR21          Broad; very low abundance
GPR22          Heart, Fetal Heart
               Lesser levels in Brain
GPR30          Stomach
GPR31          Broad
PFPR           Stomach, Liver, Thyroid, Putamen
EBI1           Pancreas
               Lesser levels in Lymphoid Tissues
EBI2           Lymphoid Tissues, Aorta, Lung, Spinal Cord
[illegible]    Broad; Brain Tissue
GPCR-CNS       Brain
               Lesser levels in Testis, Placenta
[illegible]    Pituitary
               Lesser levels in Brain
H9             Pituitary
HB954          Aorta, Cerebellum
               Lesser levels in most other tissues
HM74           Spleen, Leukocytes, Bone Marrow, Mammary Glands, Lung, Trachea
MIG            Low levels in Kidney, Liver, Pancreas, Lung, Spleen
ORG1           Pituitary, Stomach, Placenta
V28            Brain, Spleen, Peripheral Leukocytes



Based upon the foregoing information, it is noted that human GPCRs can also be assessed
for distribution in diseased tissue; comparative assessments between "normal" and diseased tissue
can then be utilized to determine the potential for over-expression or under-expression of a
particular receptor in a diseased state. In those circumstances where it is desirable to utilize the
non-endogenous versions of the human GPCRs for the purpose of screening to directly identify
candidate compounds of potential therapeutic relevance, inverse agonists are useful
in the treatment of diseases and disorders where a particular human GPCR is over-expressed,
whereas agonists or partial agonists are useful in the treatment of diseases and disorders where a
particular human GPCR is under-expressed.

As desired, more detailed, cellular localization of the receptors, using techniques well-known
to those in the art (e.g., in-situ hybridization) can be utilized to identify particular cells
within these tissues where the receptor of interest is expressed.

It is intended that each of the patents, applications, and printed publications mentioned in
this patent document be hereby incorporated by reference in their entirety.

As those skilled in the art will appreciate, numerous changes and modifications may be
made to the preferred embodiments of the invention without departing from the spirit of the
invention. It is intended that all such variations fall within the scope of the invention.

Although a variety of expression vectors are available to those in the art, for purposes of
utilization for both the endogenous and non-endogenous human GPCRs, it is most preferred
that the vector utilized be pCMV. This vector has been deposited with the American Type Culture
Collection (ATCC) on October 13, 1998 (10801 University Blvd., Manassas, VA 20110-2209
USA) under the provisions of the Budapest Treaty for the International Recognition of the Deposit
of Microorganisms for the Purpose of Patent Procedure. The vector was tested by the ATCC on
____, 1998 and determined to be viable on ____, 1998. The ATCC has assigned
the following deposit number to pCMV: ____

CLAIMS

What is claimed is: 

1. A constitutively active, non-endogenous version of an endogenous human orphan G protein-coupled
receptor (GPCR) comprising the following amino acid residues (carboxy-terminus to amino-terminus
orientation) transversing the transmembrane-6 (TM6) and intracellular loop-3 (IC3) regions
of the non-endogenous GPCR:

P1 AA15 X

wherein:

(1) P1 is an amino acid residue located within the TM6 region of the non-endogenous
GPCR, where P1 is selected from the group consisting
of (i) the endogenous orphan GPCR proline residue, and (ii) a non-endogenous
amino acid residue other than proline;

(2) AA15 are 15 amino acid residues selected from the group consisting
of (a) the 15 endogenous amino acid residues of the endogenous
orphan GPCR, (b) 15 non-endogenous amino acid residues, and (c)
a combination of 15 amino acid residues, the combination
comprising at least one endogenous amino acid residue of the
endogenous orphan GPCR and at least one non-endogenous amino
acid residue, excepting that none of the 15 endogenous amino acid
residues that are positioned within the TM6 region of the GPCR is
proline; and

(3) X is a non-endogenous amino acid residue located within the IC3 region
of said non-endogenous GPCR.

2. The non-endogenous human GPCR of claim 1 wherein P1 is the endogenous proline
residue.

3. The non-endogenous human GPCR of claim 1 wherein P1 is a non-endogenous amino
acid residue other than a proline residue.

4. The non-endogenous human GPCR of claim 1 wherein AA15 are the 15 endogenous
amino acid residues of the endogenous GPCR.

5. The non-endogenous human GPCR of claim 1 wherein X is selected from the group
consisting of lysine, histidine, arginine and alanine residues, excepting that when the
endogenous amino acid in position X of said endogenous human GPCR is lysine, X
is selected from the group consisting of histidine, arginine and alanine.

6. The non-endogenous human GPCR of claim 1 wherein X is a lysine residue, excepting
that when the endogenous amino acid in position X of said endogenous human GPCR
is lysine, X is an amino acid other than lysine.

7. The non-endogenous human GPCR of claim 4 wherein X is a lysine residue, excepting
that when the endogenous amino acid in position X of said endogenous human GPCR
is lysine, X is an amino acid other than lysine.

8. The non-endogenous, human GPCR of claim 1 wherein P1 is a proline residue and X
is a lysine residue, excepting that when the endogenous amino acid in position X of
said endogenous human GPCR is lysine, X is an amino acid other than lysine.

9. A host cell comprising the non-endogenous human GPCR of claim 1.

10. The material of claim 9 wherein said host cell is of mammalian origin.

11. The non-endogenous human GPCR of claim 1 in a purified and isolated form.

12. A nucleic acid sequence encoding a constitutively active, non-endogenous version of
an endogenous human orphan G protein-coupled receptor (GPCR) comprising the following
nucleic acid sequence region transversing the transmembrane-6 (TM6) and intracellular loop-3
(IC3) regions of the orphan GPCR:

3'-Pcodon (AA-codon)15 Xcodon-5'

wherein:

(1) Pcodon is a nucleic acid encoding region within the TM6 region of the
non-endogenous GPCR, where Pcodon encodes an amino acid selected
from the group consisting of (i) the endogenous GPCR proline residue,
and (ii) a non-endogenous amino acid residue other than proline;

(2) (AA-codon)15 are 15 codons encoding 15 amino acid residues selected
from the group consisting of (a) the 15 endogenous amino acid
residues of the endogenous orphan GPCR, (b) 15 non-endogenous
amino acid residues, and (c) a combination of 15 amino acid residues,
the combination comprising at least one endogenous amino acid
residue of the endogenous orphan GPCR and at least one non-endogenous
amino acid residue, excepting that none of the 15
endogenous amino acid residues that are positioned within the TM6
region of the orphan GPCR is proline; and

(3) Xcodon is a nucleic acid encoding region residue located within the IC3
region of said non-endogenous human GPCR, where Xcodon encodes a
non-endogenous amino acid.

13. The nucleic acid sequence of claim 12 wherein Pcodon encodes an endogenous proline
residue.

14. The nucleic acid sequence of claim 12 wherein Pcodon encodes a non-endogenous
amino acid residue other than a proline residue.

15. The nucleic acid sequence of claim 12 wherein Xcodon encodes a non-endogenous
amino acid selected from the group consisting of lysine, histidine, arginine and
alanine, excepting that when the endogenous amino acid in position X of said
endogenous human GPCR is lysine, Xcodon encodes an amino acid selected from the
group consisting of histidine, arginine and alanine.

16. The nucleic acid sequence of claim 13 wherein Xcodon encodes a non-endogenous
lysine amino acid excepting that when the endogenous amino acid in position X of
said endogenous human GPCR is lysine, Xcodon encodes an amino acid selected from
the group consisting of histidine, arginine and alanine.

17. The nucleic acid sequence of claim 12 wherein Xcodon is selected from the group
consisting of AAA, AAG, GCA, GCG, GCC and GCU.

18. The nucleic acid sequence of claim 12 wherein Xcodon is selected from the group
consisting of AAA and AAG.

19. The nucleic acid sequence of claim 12 wherein Pcodon is selected from the group
consisting of CCA, CCC, CCG and CCU, and Xcodon is selected from the group
consisting of AAA and AAG.
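
The codon groups recited in claims 17 to 19 follow the standard genetic code: AAA and AAG encode lysine, GCA, GCC, GCG and GCU encode alanine, and CCA, CCC, CCG and CCU encode proline. A minimal sketch of checking a candidate codon against these groups is given below; the function name `classify_codon` and the acceptance of DNA-style codons (T in place of U) are illustrative assumptions and not part of the claims.

```python
# Codon groups named in claims 17-19 (RNA alphabet, standard genetic code).
LYSINE_CODONS = {"AAA", "AAG"}
ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCU"}
PROLINE_CODONS = {"CCA", "CCC", "CCG", "CCU"}


def classify_codon(codon):
    """Return 'lysine', 'alanine', 'proline', or None for any other codon."""
    rna = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    if len(rna) != 3 or set(rna) - set("ACGU"):
        raise ValueError("codon must be three nucleotides")
    if rna in LYSINE_CODONS:
        return "lysine"
    if rna in ALANINE_CODONS:
        return "alanine"
    if rna in PROLINE_CODONS:
        return "proline"
    return None
```

For instance, `classify_codon("aag")` reports lysine and `classify_codon("CCT")` reports proline, while a glycine codon such as GGG falls outside all three groups.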

20. A vector comprising the nucleic acid sequence of claim 12.

21. A plasmid comprising the nucleic acid sequence of claim 12.

22. A host cell comprising the nucleic acid sequence of claim 21.

23. The nucleic acid sequence of claim 12 in a purified and isolated form.

24. A method for selecting for alteration an endogenous amino acid residue within the
third intracellular loop of a human G protein-coupled receptor ("GPCR"), said receptor
comprising a transmembrane 6 region and an intracellular loop 3 region, which endogenous
amino acid, when altered to a non-endogenous amino acid, constitutively activates said human
GPCR, comprising the following steps:

(a) identifying an endogenous proline residue within the transmembrane 6 region
of a human GPCR;

(b) identifying, by moving in a direction of the carboxy-terminus region of said
GPCR towards the amino-terminus region of said GPCR, the endogenous, 16th
amino acid residue from said proline residue;

(c) altering the endogenous residue of step (b) to a non-endogenous amino acid
residue to create a non-endogenous version of an endogenous human GPCR;
and

(d) determining whether the non-endogenous human GPCR of step (c) is
constitutively active.

25. The method of claim 24 wherein the amino acid residue that is two residues from said
proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus
direction, is tryptophan.

26. A constitutively active, non-endogenous human GPCR produced by the process of
claim 24.

27. A constitutively active, non-endogenous human GPCR produced by the process of
claim 25.

28. An algorithmic approach for creating a non-endogenous, constitutively active version
of an endogenous human G protein coupled receptor (GPCR), said endogenous GPCR
comprising a transmembrane 6 region and an intracellular loop 3 region, the
algorithmic approach comprising the steps of:

(a) selecting an endogenous human GPCR comprising a proline residue in the
transmembrane-6 region;

(b) identifying, by counting 16 amino acid residues from the proline residue of
step (a), in a carboxy-terminus to amino-terminus direction, an endogenous
amino acid residue;

(c) altering the identified amino acid residue of step (b) to a non-endogenous
amino acid residue to create a non-endogenous version of the endogenous
human GPCR; and

(d) determining if the non-endogenous version of the endogenous human GPCR
of step (c) is constitutively active.

29. The algorithmic approach of claim 28 wherein the amino acid residue that is two
residues from said proline residue in the transmembrane 6 region, in a carboxy-terminus
to amino-terminus direction, is tryptophan.
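
Steps (a) through (c) of the approach recited above can be sketched in code. This is only an illustration under stated assumptions: the function name `proline_mutation`, the 0-based half-open TM6 boundaries supplied by the caller, and the requirement that exactly one proline lie within TM6 are choices made for the sketch; the sketch does not verify that the selected position lies in IC3, and step (d), determining constitutive activity, is experimental and is not modeled.

```python
def proline_mutation(seq, tm6_start, tm6_end, replacement="K"):
    """Select and alter the 16th residue N-terminal of the TM6 proline.

    seq: one-letter amino acid sequence written N-terminus to C-terminus.
    tm6_start, tm6_end: assumed TM6 boundaries (0-based, end exclusive).
    Returns (mutation label, mutant sequence, tryptophan two residues
    N-terminal of the proline?).
    """
    prolines = [i for i in range(tm6_start, tm6_end) if seq[i] == "P"]
    if len(prolines) != 1:
        raise ValueError("sketch expects exactly one proline within TM6")
    p = prolines[0]
    x = p - 16  # proline, then 15 intervening residues, then position X
    if x < 0:
        raise ValueError("sequence too short N-terminal of the proline")
    if seq[x] == replacement:
        raise ValueError("endogenous residue already equals the replacement")
    label = f"{seq[x]}{x + 1}{replacement}"  # 1-based, e.g. V234K style
    mutant = seq[:x] + replacement + seq[x + 1:]
    return label, mutant, seq[p - 2] == "W"
```

Applied to a toy 47-residue sequence with valine at position 21, tryptophan at position 35 and proline at position 37, with TM6 taken as indices 30 to 45, the sketch selects the valine and returns the label V21K, the same notation used for the codon mutations in Table E.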

30. A constitutively active, non-endogenous human GPCR produced by the algorithmic
approach of claim 28.

31. A constitutively active, non-endogenous human GPCR produced by the algorithmic
approach of claim 29.

32. A method for directly identifying a compound selected from the group consisting of
inverse agonists, agonists and partial agonists to a non-endogenous, constitutively
activated human G protein coupled receptor, said receptor comprising a
transmembrane-6 region and an intracellular loop-3 region, comprising the steps of:

(a) selecting an endogenous human GPCR;

(b) identifying a proline residue within the transmembrane-6 region of the GPCR
of step (a);

(c) identifying, in a carboxy-terminus to amino-terminus direction, the
endogenous, 16th amino acid residue from the proline residue of step (b);

(d) altering the endogenous amino acid of step (c) to a non-endogenous amino
acid;

(e) confirming that the non-endogenous GPCR of step (d) is constitutively active;

(f) contacting a candidate compound with the non-endogenous, constitutively-activated
GPCR of step (e); and

(g) determining, by measurement of the compound efficacy at said contacted
receptor, whether said compound is an inverse agonist, agonist or partial
agonist of said receptor.

33. The method of claim 32 wherein the non-endogenous amino acid of step (d) is lysine.

34. A compound directly identified by the method of claim 32.

35. The method of claim 32 wherein the directly identified compound is an inverse
agonist.

36. The method of claim 32 wherein the directly identified compound is an agonist.

37. The method of claim 32 wherein the directly identified compound is a partial agonist.

38. A composition comprising the inverse agonist of claim 35.

39. A composition comprising the agonist of claim 36.

40. A composition comprising the partial agonist of claim 37.

41. A method for directly identifying an inverse agonist to a non-endogenous,
constitutively activated human G protein coupled receptor ("GPCR"), said receptor comprising
a transmembrane-6 region and an intracellular loop-3 region, comprising the steps of:

(a) selecting an endogenous human GPCR;

(b) identifying a proline residue within the transmembrane-6 region of the GPCR of
step (a);

(c) identifying, in a carboxy-terminus to amino-terminus direction, the
endogenous, 16th amino acid residue from the proline residue of step (b);

(d) altering the endogenous amino acid of step (c) to a non-endogenous lysine residue;

(e) confirming that the non-endogenous GPCR of step (d) is constitutively active;

(f) contacting a candidate compound with the non-endogenous, constitutively-activated
GPCR of step (e); and

(g) determining, by measurement of the compound efficacy at said contacted receptor,
whether said compound is an inverse agonist of said receptor.

42. An inverse agonist directly identified by the method of claim 37.

43. A composition comprising an inverse agonist of claim 38.
[Figure: annotated double-stranded nucleotide sequence with restriction sites (including EcoR V, Not I, Dra I, Xba I, Sph I, Nsi I, Nco I, ApaL I and Nci I), position markers running from 80 to 2240, and reading-frame translations; the sequence itself is not legible in the scanned source.]
VAVSSPCYKTDASQQLVNGESFSNTAR 
RSC FSLIQYRREASAL R R F F L Q Y S 

TTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTCTTTGCAAGCAGCAGATTACCCCCAGAAAAAAAGGAT ^^^^ 

1 1 1 ! \ ! 1 I 1 1 1 1 1 1 1 h 2320 

AACTAGGCCCTTTGTTTGGTGCCGACCATCCCCACCAAAAAAACAAACGTTCGTCGTCTAATGCGCCTCTTTTTTTCCTA 

• SGKQTTACSGGFFVCKQ QITRRKK C 
LDPANKPPLVAVVFLFASSRLRAEKKD 
L IRQTNHRW* RIFFCLQAADYAQKKRI 

I 1 1 H 1 1 1 j 1 1 \ 1 1 1 1 1 I 

Q DPLCVVAP LPPKKTQLCCIVRLFFPD 
SGAFLGGS^TATTKKNALLLNRASFFS 
KIRCVFWRQYRHNKQKCAAS • ACFFLl 



Bspp I 



CTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACCTTAAGGCATTTTGCTCATC^,^^ 

1 \ 1 1 1 1 \ 1 i I 1 1 i I 1 h 2400 

GAGTTCTTCTAGGAAACTACAAAACATGCCCCACACTCCGAGTCACCTTGCTTTTCAGTCCAATTCCCTAAAACCAGTAC 

SQEDPLIFSTG SDAQWNENSR GILVM 
LKKI L SFLRGLTLSGTKTHVKGFWS- 
SRRSFDLFYGV RSVERKLTL RDFGH 

I 1 1 \ 1 1 1 1 I 1 i 1 ! ( I I 1- 

• SSCKIKEVPDSA HFSFER • PIKTM 
RLFtRQQKRRPRVSLPVFV • 7LPNQDH 
ELLDKSRK PTQRETSRFSVNLSKP-S 



|Dra I pra 



I 



AGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTA^,.^ 

I 1 \ 1 1 1 1 1 I 1 1 1 1 \ 1 i K2480 

TCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAATTTAATTTTTACTTCAAAATTTAGTTAGATTTCATATATACTCAT 

KK^^KK \. ^ L L N • K • S F K S I S I YE • 

« K« S P R^ S F • I K N E V L N Q S K V Y M S 

El IKKDLHLDPFKLKMKF - INLKYI V 

I 1 1 ! 1 1 1 1 1 1 1 1 1 1 I 1 f 

L ^« K« ^« L * K F • F H L K L D I • L I Y S Y 

S * P„ D E C K • I L F S T K F • D L T Y I L L 

IKLFSR RSGKLNFIFN -ILRFYIHT 
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AACTTCGTCTCACACmCCmCCTTAAT CAGTCACCCACCTATCTCAGCCATCTGTCTATTTC CnCATCCA 

ttgaaccagactgtcaatcgmwaanactcactccgtgcatwtcgcm^^^ 

k\ Vp\VVv ^ ^„ % \ 

I +-1h ^- 1 " r ^ ; ° T Y L S D L S I S P I H S C 



■< 1 1 h— — I I 1 f 



% % % S S V 'l "a ^. 'd S \ *C '^R 'n ^ *» » K » m' T k 

F K T Q C N G I *S L ''p S ^ °R \ «s ^ I '^E \ 11 S \ "q 

pie m 

cctgactccccctcctgtacataactacgatacggcaggccttaccatctcgU 



CGACTCAGGGGCACCACATCTATTGATGCTATGCCCTCCCCAATCGTAGACCGCCGTCAro^^^^ 2640 
Q- S G T T Y I V V I 



1 + 1 1 h 



C S E C D H L Y S R 'y "p \ \ * K \ ^ K ^ 
B V G R R T S\'.\\%\', v \ \ *G \ \ \ \ \\ \ \ \ 

fiae ffl Ava n 

CCACGCTCACCCCCTCCACATTTATCAGCAATAAA CCAGCCAGCC GGAALGG'cCGAGCGCAGAAr.TnU 
GGTCCGAGTGCCCGAGGTCTAAATACTrolTATTlGGT C^CTCGCCCTTCCCGGCTTO "^0 
''h \ \ \ \ ^ S S A I N P. A C R A E R R S G P A T I 

p t.L T G.s R fVs^ K;pysyK'^a''R^^^K\%''c%s 

Va 'r *s ^ ^ ^ *c\ < \ aJp l a^ i R 'l l' p i a 'v k' 
.Asel fJ" I Fsp I 

. :CCGCAAGCTAGAGTAAGTAGTTCCCC AGTTAATACTTT(;Jr.rAArRTTP. 
TAGGCGGAGGTACGTCAGATAAmACAACCGCCCTTCGATCTCATTCATCAACrcGTcJ -- ' ' ^ ' 



atccccctccatccagtctat^aattgttccJggca agctagagtaagtacttccccagttaa ^ 

-AGGCGGAGOTACGTCAGATAATTAACAACCGCCCTTCGATCTCATTCATCAAGiGGTCMTTATCAAAiGl 
% *P % 's 'l V % ^ «G \ K ^ ^ ^ ^ ^ ^ N S L R H 

V. ^ «n *. K I . k Q. Q. R S 



TTCCCATTCCTACACGCATCGTGGTGTCACGCTCGT CGTTTGGTA TGGCTTCATTCACCT^ 

AACGGTAACGATGTCCCTACCACCACAGTGCOAGCACCAAACCATACCCAAGTAAGTC^^^^ 2B80 

\ *p 'l \ % ^ 's % ^ 'h "a ^ ^ °» "w *, s p s s g s q r s r 

.C.H C^Y R H^R O^vVl V^^yc^H,^,A^P,%Pp^ Q.C 

'C ^ «D ° ^ ^R "^R "k % 'h k' L i P 'e w' R i L ' 

Q W Q . L C B % «T V S «T «T ^ \ «p „ B A^ G^ P^ 






km n .Pvu I H|ie in 



CCAGTTACATCATCCCCCATOTTGTCCAAAAAACCCGTTAGCTCCTTCGGTCCTCCCATCGTTCTCACAAGTAAGTTCGC 

1 1 1 1 I 1 1 1 I 1 1 I 1 1 1 1- 2960 

GCTCAATGTACTAGGCCCTACAACACGTTTTTTCGCCAATCCAGGAACCCAGCAGGCTAGCAACAGTCTTCATTCAACCG 

^. T„ \. U C K K A V S S P G P P I V V R S K L A 

L„ H D P, C C A^ K K R L A P S V L R S L S E V S W 
ASYUIPHVVQKSG LLRSSDRCQK VG 

I 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1- 

R T„ V H„ H L f A T L E K P G C I T T L L L H A 

S N C S C_ H A^ P F R N A C E T R R D D S T L Q G 
L UIGWTTCFLP SRRDESRQ • FTTP 

CGCAGTCTTATCACTCATCGTTATG GCAGCACTGCATAATTCTCTTACTGTCATGCCATCCCTAAGATGCTTTTCTCTGA 

— H 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1- 3040 

GCCTCACAATAGTCACTACCAATACCGTCGTGACGTATTAAGAGAATGACAGTACGGTAGCCATTCTACGAAAAGACACT 

A« ^» '•^ S. L„ M_ V M_ A^ A I H N S L T V II P S V R C F S V 
P Q C Y H f L„ W (J H C I I L L L S C H P ■ D A P L • 
RSVITHGYGSTA FSYCHAIRKMLFCD 

I ^ 1 I 1 1 1 1 1 1 1 1-^ 1 1 1 1 h 

A^ T N D S y T„ I A A S C L E R V T y G D T L H K E T V 
C H • E H H H C C Q M I R K S D H f G Y S A K R^ H 
RLTIV - P PLVAYNE Q - AHRLISKQS 

Rsa I 

pea I Nci I Hinc n 

C TCCTGAGTACTCAACCAAC TCATTCTGAGAATAGTGTATCCGGCGACCGAGTTGCTCTTGCCCGGCGTCAACACGCCAT „ . 

1 1 I 1 1 1 1 1 1 1 1 1 1 1 I 1- 3120 

GACCACTCATCAGTTGGTTCACTAACACTCTTATCACATACGCCCCTCCCTCAACCACAACCGCCCCCAGTTGTGCCCTA 

"f. T« F^ • E • C y R R P S C S C P A S T R D 

S„ ^„ H S E H S V C 6 D R V A L A R R Q H G I 

W VLNQVILRIVYAATELL LPGVHTC 

I 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 f 

P S Y, B V L D N„ S Y H I R R G L Q E Q G A 0 V R S 
S„ T„ L V • G L • S F L T„ H P S R T A R A R R • C P I 
QHTSLWTyRLlTYAAVSNSKGPTLVPY 



pra I Xmn I 



I 

AATACCCCCCCACATACCAGAACTTTA AAAGTCCTCATCATTGGAAAACCTTCTTCCCCGCGAAAACTCTCAACGATCTT 

I 1 \ 1 1 1 1 1 1 1 ( 1 1 1 1 1 ^ 3200 

TTATCGCGCGGTCTATCGTCTTCAAATTTTCACGAGTAGTAACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCCTAGAA 

''^ *„ P.. S R„ T L K V L I 1 G K R S S G R K L S R I L 
L ' K K K - K C S S L E N V L R G E N S Q G S 
•YRAT • QNFKSAHHWKTFFGAKTLKDL 
H-H 1 1 1 1 1 -H 1 1 1 1 1 1 1 1 k 

^, ^» K^^^ J'- T S y y P F R E E P R F S E L 1 K 

•v^a S S • FH E 0 H S F T R R P S F E • P D • 

YRAVYCFKLLA- QF VNKPAPVRLSR 



fig. 31 
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I 



ACCCCTCTTGACATCCACTTCCATGTAACCCACTC^^^^^ ^^ 

»ccctcTciimT^TCclTcc oW^«,ccWccc^T,c^ccciiTacc^>c;.^^^ 

I C K *« 'k '9 °B "o 'k "u S *0 "v «» ' « A t R l[ c I I , I 



-« 1 1 1- 

SMS 
YE 
V R 



TTCCTTTTTCAATATTATTCAACCATTTATCACCGTTATT. TCTm^ . 
AACCAAAAA(^TTAT;aTAA(^TTC cW1cTCC^^^^^^ 3440 



avrNND AEQCQNNVL> 



H 1 K- 

K R K • y . Q I If . T 

G K E I 



\ \ ^ 'l ^, n ^ » 



re in 
Bfil I 

; SI T G S L V H spy 



K ^ S„ A_ L H N_ L R 



U A R 



"« "p 'i '» '0 \ ' 

1 t s ? t r p c 0 



!F/f.Jf 
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TGCCTGACCGCCCAACCACCCCCCCCCATTGACCTCAATAAT GACCTATGTTCCCATAGTAACCCCAATACCGACTTTCC 

I 1 I • 1 1 I 4-^ 1 I I I I I I I 3600 

ACCCACTGGCGGGTTGCTGCCGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCATTGCGGTTATCCCTGAAAGG 

LADRPTTPAH-RQ - •RliPP - - RO-BLS 
I L T A Q R P P P I D V M N B V C S H S HAH R D P P 
. C • P P N D P R P L T S I "ll T Y V P "l f T *P "l 0 "t P ' 

I > 1 1 1 1 1 1 1 1 1 1 1 1 1 h 

^A^SRGVVGAIQR • YHR I NGYYRWYPSE 

\ % *a \ \ % \ % -H \ % S \ % ^ ^V\V^VS VVk\ 
n M I Rsa I Nde I Rsa I 

ATTGACCTCAATGCCTGGACTATTTACGGTAAACTCCCCACTTGGC ACTACATCAACTGTATCATATGCCAAGTACGCCC 

' * ■ ' "I ~^H ~^^ i — I I I I j j i ' I 3680 

TAACTCCAGTTACCCACCTCATAAATGCCATTTGACCGCTGAACCGTCATCTACTTCACATAGTATACGGTTCATCCCCG 
IDVNCWTIYGKLPTYQYIKCI ICQVRP 

QR HTS-KRYVAfKATC TY - I61VG 

rae m 
pgl I Rsa I 

ccTAfTtiAUUTLAATUACGGTAAATGGCCCCCCTCCCATT ATCCCCACTACATCACCTTATGCCACTTTCCTACTTCGCA 

' ' ' ' f I I I I I 1 I I I I 3760 

GGATAACTGCACTTACTCCCATTTACCGGGCCGACCGTAATACGGGTCATGTACTGGAATACCCTGAAACCATGAACCGT 

p '•v ^ ^0 " ^ ^ ''a ^ ^ S T P Y G T T L L G 

» ^1 n M % ^» ^ A L C P V H D L U C L S Y L A 

P D V H D C K W P A f H Y A Q Y M T L f D P P T ¥ 0 

I 1 1 1 1 1 1 1 H 1 1 1 1 1 \ 1 ^ 

r I T I % \ 'n *n ''a " T„ C S R 1 P S E • K A 

GI STLSPLHGAQC • AWYUVKHSKCVQC 

BsaA I Nco I 

psa I pnaB I pty I pa I 

CTACATCTACGTATTACTCATCGCTATTACCATGGTGAT GCCGTTTTGGCAGTACATCAATCGCCCTGGATAGCCCTTTC 

'III ' ' ' I I I I I I I I I i 3B40 

CATCTAGATCCATAATCAGTAGCGATAATGGTACCACTACCCCAAAACCCTCATGTAGTTACCCCCACCTATCCCCAAAC 
■ V V y J *. ' T M VMRFWQYINGRC RF 

' 1 ' ^ 1 1 1 1 1 1 1 1 \ 1 I 1 ^ 

T ^ I " °- «c P ^ V D I P T S L P K 

\ \ N S 'u\ l V \ % ^ \ \ \ \ I \ *R «P \ *R ^H % 
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AaX n 

TaACTCCCCCTAAACCTTCACACC^CC(K;lAACT^CACT; ACCC;CAAA^AAAA^CCTG.;m^^ '''' 
Jl S % ^ % \ % \ ''h % ^0 «, ^ ^ C P G T K I H G T P Q 

E R P M C I R f c M ''s T L "p \ "k % '^d 'c % ^ % % 'k 

Bsa I Sac I 

AAATCTCCTAACAACTCCGCCCCATTGACCCAAATG GCCCGTACC CGTGT'ACGGTCCCAnr.^^ L 

mACAGCmGT^^^^^^^ 4000 

jRsft I 

CTGGCTAACTAGAGAACCCACTGCmACTCGCTTATCGA AATU^^ 

GACCGATTCATCTCTTGGGTGACGiATTG^CCCA^TACCTTTAATTATGCTCAGTCATAlcCCTCTGCC**"" 
V*. '^t'- 'h \ \ \ \ \ 's \ \ \ \ \ \ Y H E T 



[Figure, sheet 15/17: bar charts. Labels recoverable from the scan: x-axis "Expression plasmid", with bar labels CMV, GPR6, GPR6 (K256) and GPR30 (label order uncertain); y-axis ticks 0 to 150 in steps of 25 and 0 to 7000 in steps of 1000.]
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Behan, Dominic P.

Chalmers, Derek T.
Liaw, Chen W.

(ii) TITLE OF INVENTION: Non-Endogenous, Constitutively

Activated Human G Protein-Coupled
Orphan Receptors

(iii) NUMBER OF SEQUENCES: 280 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Arena Pharmaceuticals, Inc. 

(B) STREET: 6166 Nancy Ridge Drive 

(C) CITY: San Diego 

(D) STATE : CA 

(E) COUNTRY: USA 

(F) ZIP: 92122 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Burgoon, Richard P. 

(B) REGISTRATION NUMBER: 34,787 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (619)453-7200 

(B) TELEFAX: (619)453-7210 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATGGAAGATT TGGAGGAAAC ATTATTTGAA GAATTTGAAA ACTATTCCTA 



TATTACTCTC TGGAGTCTGA TTTGGAGGAG AAAGTCCAGC TGGGAGTTGT TCACTGGGTC 120

TCCCTGGTGT TATATTGTTT GGCTTTTGTT CTGGGAATTC CAGGAAATGC CATCGTCATT 180

TGGTTCACGG GGCTCAAGTG GAAGAAGACA GTCACCACTC TGTGGTTCCT CAATCTAGCC 240

ATTGCGGATT TCATTTTTCT TCTCTTTCTG CCCCTGTACA TCTCCTATGT GGCCATGAAT 300

TTCCACTGGC CCTTTGGCAT CTGGCTGTGC AAAGCCAATT CCTTCACTGC CCAGTTGAAC 360

ATGTTTGCCA GTGTTTTTTT CCTGACAGTG ATCAGCCTGG ACCACTATAT CCACTTGATC 420

CATCCTGTCT TATCTCATCG GCATCGAACC CTCAAGAACT CTCTGATTGT CATTATATTC 480

ATCTGGCTTT TGGCTTCTCT AATTGGCGGT CCTGCCCTGT ACTTCCGGGA CACTGTGGAG 540

TTCAATAATC ATACTCTTTG CTATAACAAT TTTCAGAAGC ATGATCCTGA CCTCACTTTG 600

ATCAGGCACC ATGTTCTGAC TTGGGTGAAA TTTATCATTG GCTATCTCTT CCCTTTGCTA 660

ACAATGAGTA TTTGCTACTT GTGTCTCATC TTCAAGGTGA AGAAGCGAAC AGTCCTGATC 720

TCCAGTAGGC ATTTCTGGAC AATTCTGGTT GTGGTTGTGG CCTTTGTGGT TTGCTGGACT 780

CCTTATCACC TGTTTAGCAT TTGGGAGCTC ACCATTCACC ACAATAGCTA TTCCCACCAT 840

GTGATGCAGG CTGGAATCCC CCTCTCCACT GGTTTGGCAT TCCTCAATAG TTGCTTGAAC 900

CCCATCCTTT ATGTCCTAAT TAGTAAGAAG TTCCAAGCTC GCTTCCGGTC CTCAGTTGCT 960

GAGATACTCA AGTACACACT GTGGGAAGTC AGCTGTTCTG GCACAGTGAG TGAACAGCTC 1020

AGGAACTCAG AAACCAAGAA TCTGTGTCTC CTGGAAACAG CTCAATAA 1068



(3) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Glu Asp Leu Glu Glu Thr Leu Phe Glu Glu Phe Glu Asn Tyr Ser
1 5 10 15

Tyr Asp Leu Asp Tyr Tyr Ser Leu Glu Ser Asp Leu Glu Glu Lys Val
20 25 30

Gln Leu Gly Val Val His Trp Val Ser Leu Val Leu Tyr Cys Leu Ala
35 40 45

Phe Val Leu Gly Ile Pro Gly Asn Ala Ile Val Ile Trp Phe Thr Gly
50 55 60

Leu Lys Trp Lys Lys Thr Val Thr Thr Leu Trp Phe Leu Asn Leu Ala
65 70 75 80

Ile Ala Asp Phe Ile Phe Leu Leu Phe Leu Pro Leu Tyr Ile Ser Tyr
85 90 95

Val Ala Met Asn Phe His Trp Pro Phe Gly Ile Trp Leu Cys Lys Ala
100 105 110

Asn Ser Phe Thr Ala Gln Leu Asn Met Phe Ala Ser Val Phe Phe Leu
115 120 125

Thr Val Ile Ser Leu Asp His Tyr Ile His Leu Ile His Pro Val Leu
130 135 140

Ser His Arg His Arg Thr Leu Lys Asn Ser Leu Ile Val Ile Ile Phe
145 150 155 160

Ile Trp Leu Leu Ala Ser Leu Ile Gly Gly Pro Ala Leu Tyr Phe Arg
165 170 175

Asp Thr Val Glu Phe Asn Asn His Thr Leu Cys Tyr Asn Asn Phe Gln
180 185 190

Lys His Asp Pro Asp Leu Thr Leu Ile Arg His His Val Leu Thr Trp
195 200 205

Val Lys Phe Ile Ile Gly Tyr Leu Phe Pro Leu Leu Thr Met Ser Ile
210 215 220

Cys Tyr Leu Cys Leu Ile Phe Lys Val Lys Lys Arg Thr Val Leu Ile
225 230 235 240

Ser Ser Arg His Phe Trp Thr Ile Leu Val Val Val Val Ala Phe Val
245 250 255

Val Cys Trp Thr Pro Tyr His Leu Phe Ser Ile Trp Glu Leu Thr Ile
260 265 270

His His Asn Ser Tyr Ser His His Val Met Gln Ala Gly Ile Pro Leu
275 280 285

Ser Thr Gly Leu Ala Phe Leu Asn Ser Cys Leu Asn Pro Ile Leu Tyr
290 295 300

Val Leu Ile Ser Lys Lys Phe Gln Ala Arg Phe Arg Ser Ser Val Ala
305 310 315 320

Glu Ile Leu Lys Tyr Thr Leu Trp Glu Val Ser Cys Ser Gly Thr Val
325 330 335

Ser Glu Gln Leu Arg Asn Ser Glu Thr Lys Asn Leu Cys Leu Leu Glu
340 345 350

Thr Ala Gln
355

(4) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1089 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGGGCAACC ACACGTGGGA GGGCTGCCAC GTGGACTCGC GCGTGGACCA CCTCTTTCCG 60 

CCATCCCTCT ACATCTTTGT CATCGGCGTG GGGCTGCCCA CCAACTGCCT GGCTCTGTGG 120 

GCGGCCTACC GCCAGGTGCA ACAGCGCAAC GAGCTGGGCG TCTACCTGAT GAACCTCAGC 180 

ATCGCCGACC TGCTGTACAT CTGCACGCTG CCGCTGTGGG TGGACTACTT CCTGCACCAC 240 

GACAACTGGA TCCACGGCCC CGGGTCCTGC AAGCTCTTTG GGTTCATCTT CTACACCAAT 300 

ATCTACATCA GCATCGCCTT CCTGTGCTGC ATCTCGGTGG ACCGCTACCT GGCTGTGGCC 360 

CACCCACTCC GCTTCGCCCG CCTGCGCCGC GTCAAGACCG CCGTGGCCGT GAGCTCCGTG 420 

GTCTGGGCCA CGGAGCTGGG CGCCAACTCG GCGCCCCTGT TCCATGACGA GCTCTTCCGA 480 

GACCGCTACA ACCACACCTT CTGCTTTGAG AAGTTCCCCA TGGAAGGCTG GGTGGCCTGG 540

ATGAACCTCT ATCGGGTGTT CGTGGGCTTC CTCTTCCCGT GGGCGCTCAT GCTGCTGTCG 600 

TACCGGGGCA TCCTGCGGGC CGTGCGGGGC AGCGTGTCCA CCGAGCGCCA GGAGAAGGCC 660 

AAGATCAAGC GGCTGGCCCT CAGCCTCATC GCCATCGTGC TGGTCTGCTT TGCGCCCTAT 720 

CACGTGCTCT TGCTGTCCCG CAGCGCCATC TACCTGGGCC GCCCCTGGGA CTGCGGCTTC 780 

GAGGAGCGCG TCTTTTCTGC ATACCACAGC TCACTGGCTT TCACCAGCCT CAACTGTGTG 840 

GCOGACCCCA TCCTCTACTG CCTGGTCAAC GAGGGCGCCC GCAGCGATGT GGCCAAGGCC 900 

CTGCACAACC TGCTCCGCTT TCTGGCCAGC GACAAGCCCC AGGAGATGGC CAATGCCTCG 960 

CTCACCCTGG AGACCCCACT CACCTCCAAG AGGAACAGCA CAGCCAAAGC CATGACTGGC 1020 

AGCTGGGCGG CCACTCCGCC TTCCCAGGGG GACCAGGTGC AGCTGAAGAT GCTGCCGCCA 1080 

GCACAATGA 1089 
(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Gly Asn His Thr Trp Glu Gly Cys His Val Asp Ser Arg Val Asp
1 5 10 15

His Leu Phe Pro Pro Ser Leu Tyr Ile Phe Val Ile Gly Val Gly Leu
20 25 30

Pro Thr Asn Cys Leu Ala Leu Trp Ala Ala Tyr Arg Gln Val Gln Gln
35 40 45

Arg Asn Glu Leu Gly Val Tyr Leu Met Asn Leu Ser Ile Ala Asp Leu
50 55 60

Leu Tyr Ile Cys Thr Leu Pro Leu Trp Val Asp Tyr Phe Leu His His
65 70 75 80

Asp Asn Trp Ile His Gly Pro Gly Ser Cys Lys Leu Phe Gly Phe Ile
85 90 95

Phe Tyr Thr Asn Ile Tyr Ile Ser Ile Ala Phe Leu Cys Cys Ile Ser
100 105 110

Val Asp Arg Tyr Leu Ala Val Ala His Pro Leu Arg Phe Ala Arg Leu
115 120 125

Arg Arg Val Lys Thr Ala Val Ala Val Ser Ser Val Val Trp Ala Thr
130 135 140

Glu Leu Gly Ala Asn Ser Ala Pro Leu Phe His Asp Glu Leu Phe Arg
145 150 155 160

Asp Arg Tyr Asn His Thr Phe Cys Phe Glu Lys Phe Pro Met Glu Gly
165 170 175

Trp Val Ala Trp Met Asn Leu Tyr Arg Val Phe Val Gly Phe Leu Phe
180 185 190

Pro Trp Ala Leu Met Leu Leu Ser Tyr Arg Gly Ile Leu Arg Ala Val
195 200 205

Arg Gly Ser Val Ser Thr Glu Arg Gln Glu Lys Ala Lys Ile Lys Arg
210 215 220

Leu Ala Leu Ser Leu Ile Ala Ile Val Leu Val Cys Phe Ala Pro Tyr
225 230 235 240

His Val Leu Leu Leu Ser Arg Ser Ala Ile Tyr Leu Gly Arg Pro Trp
245 250 255

Asp Cys Gly Phe Glu Glu Arg Val Phe Ser Ala Tyr His Ser Ser Leu
260 265 270

Ala Phe Thr Ser Leu Asn Cys Val Ala Asp Pro Ile Leu Tyr Cys Leu
275 280 285

Val Asn Glu Gly Ala Arg Ser Asp Val Ala Lys Ala Leu His Asn Leu
290 295 300

Leu Arg Phe Leu Ala Ser Asp Lys Pro Gln Glu Met Ala Asn Ala Ser
305 310 315 320

Leu Thr Leu Glu Thr Pro Leu Thr Ser Lys Arg Asn Ser Thr Ala Lys
325 330 335

Ala Met Thr Gly Ser Trp Ala Ala Thr Pro Pro Ser Gln Gly Asp Gln
340 345 350

Val Gln Leu Lys Met Leu Pro Pro Ala Gln
355 360
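
The nucleic acid entries in this listing are paired with amino acid entries (for example SEQ ID NO: 3 with SEQ ID NO: 4), so a scanned copy can be checked by translating the DNA and comparing it with the listed protein. The following is a minimal sketch of such a check, assuming the standard genetic code and reading frame 1 from the first ATG. The function and variable names are illustrative and are not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the listed open reading frames use the standard nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes, matching the style used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate DNA (whitespace ignored) in frame 1 until the first stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First row (nucleotides 1-60) of SEQ ID NO: 3 as printed in the listing.
SEQ3_FIRST_ROW = (
    "ATGGGCAACC ACACGTGGGA GGGCTGCCAC GTGGACTCGC GCGTGGACCA CCTCTTTCCG"
)

if __name__ == "__main__":
    print(" ".join(translate(SEQ3_FIRST_ROW)))
```

The same arithmetic gives a quick length check: 1089 base pairs is 363 codons, that is, 362 amino acids plus one stop codon, which agrees with the stated length of SEQ ID NO: 4.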

(6) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TATGAATTCA GATGCTCTAA ACGTCCCTGC 30 

(7) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
TCCGGATCCA CCTGCACCTG CGCCTGCACC 30 

(8) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGGAGTCCT CAGGCAACCC AGAGAGCACC ACCTTTTTTT ACTATGACCT TCAGAGCCAG 60

CCGTGTGAGA ACCAGGCCTG GGTCTTTGCT ACCCTCGCCA CCACTGTCCT GTACTGCCTG 120

GTGTTTCTCC TCAGCCTAGT GGGCAACAGC CTGGTCCTGT GGGTCCTGGT GAAGTATGAG 180

AGCCTGGAGT CCCTCACCAA CATCTTCATC CTCAACCTGT GCCTCTCAGA CCTGGTGTTC 240

GCCTGCTTGT TGCCTGTGTG GATCTCCCCA TACCACTGGG GCTGGGTGCT GGGAGACTTC 300

CTCTGCAAAC TCCTCAATAT GATCTTCTCC ATCAGCCTCT ACAGCAGCAT CTTCTTCCTG 360

ACCATCATGA CCATCCACCG CTACCTGTCG GTAGTGAGCC CCCTCTCCAC CCTGCGCGTC 420

CCCACCCTCC GCTGCCGGGT GCTGGTGACC ATGGCTGTGT GGGTAGCCAG CATCCTGTCC 480

TCCATCCTCG ACACCATCTT CCACAAGGTG CTTTCTTCGG GCTGTGATTA TTCCGAACTC 540

ACGTGGTACC TCACCTCCGT CTACCAGCAC AACCTCTTCT TCCTGCTGTC CCTGGGGATT 600

ATCCTGTTCT GCTACGTGGA GATCCTCAGG ACCCTGTTCC GCTCACGCTC CAAGCGGCGC 660

CACCGCACGG TCAAGCTCAT CTTCGCCATC GTGGTGGCCT ACTTCCTCAG CTGGGGTCCC 720

TACAACTTCA CCCTGTTTCT GCAGACGCTG TTTCGGACCC AGATCATCCG GAGCTGCGAG 780

GCCAAACAGC AGCTAGAATA CGCCCTGCTC ATCTGCCGCA ACCTCGCCTT CTCCCACTGC 840

TGCTTTAACC CGGTGCTCTA TGTCTTCGTG GGGGTCAAGT TCCGCACACA CCTGAAACAT 900

GTTCTCCGGC AGTTCTGGTT CTGCCGGCTG CAGGCACCCA GCCCAGCCTC GATCCCCCAC 960

TCCCCTGGTG CCTTCGCCTA TGAGGGCGCC TCCTTCTACT GA 1002


(9) INFORMATION FOR SEQ ID NO: 8: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
Met Glu Ser Ser Gly Asn Pro Glu Ser Thr Thr Phe Phe Tyr Tyr Asp
1 5 10 15

Leu Gln Ser Gln Pro Cys Glu Asn Gln Ala Trp Val Phe Ala Thr Leu
20 25 30

Ala Thr Thr Val Leu Tyr Cys Leu Val Phe Leu Leu Ser Leu Val Gly
35 40 45

Asn Ser Leu Val Leu Trp Val Leu Val Lys Tyr Glu Ser Leu Glu Ser
50 55 60

Leu Thr Asn Ile Phe Ile Leu Asn Leu Cys Leu Ser Asp Leu Val Phe
65 70 75 80

Ala Cys Leu Leu Pro Val Trp Ile Ser Pro Tyr His Trp Gly Trp Val
85 90 95

Leu Gly Asp Phe Leu Cys Lys Leu Leu Asn Met Ile Phe Ser Ile Ser
100 105 110

Leu Tyr Ser Ser Ile Phe Phe Leu Thr Ile Met Thr Ile His Arg Tyr
115 120 125

Leu Ser Val Val Ser Pro Leu Ser Thr Leu Arg Val Pro Thr Leu Arg
130 135 140

Cys Arg Val Leu Val Thr Met Ala Val Trp Val Ala Ser Ile Leu Ser
145 150 155 160

Ser Ile Leu Asp Thr Ile Phe His Lys Val Leu Ser Ser Gly Cys Asp
165 170 175

Tyr Ser Glu Leu Thr Trp Tyr Leu Thr Ser Val Tyr Gln His Asn Leu
180 185 190

Phe Phe Leu Leu Ser Leu Gly Ile Ile Leu Phe Cys Tyr Val Glu Ile
195 200 205

Leu Arg Thr Leu Phe Arg Ser Arg Ser Lys Arg Arg His Arg Thr Val
210 215 220

Lys Leu Ile Phe Ala Ile Val Val Ala Tyr Phe Leu Ser Trp Gly Pro
225 230 235 240

Tyr Asn Phe Thr Leu Phe Leu Gln Thr Leu Phe Arg Thr Gln Ile Ile
245 250 255

Arg Ser Cys Glu Ala Lys Gln Gln Leu Glu Tyr Ala Leu Leu Ile Cys
260 265 270

Arg Asn Leu Ala Phe Ser His Cys Cys Phe Asn Pro Val Leu Tyr Val
275 280 285

Phe Val Gly Val Lys Phe Arg Thr His Leu Lys His Val Leu Arg Gln
290 295 300

Phe Trp Phe Cys Arg Leu Gln Ala Pro Ser Pro Ala Ser Ile Pro His
305 310 315 320

Ser Pro Gly Ala Phe Ala Tyr Glu Gly Ala Ser Phe Tyr
325 330

(10) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GCAAGCTTGG GGGACGCCAG GTCGCCGGCT 30

(11) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GCGGATCCGG ACGCTGGGGG AGTCAGGCTG C 31

(12) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 987 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATGGACAACG CCTCGTTCTC GGAGCCCTGG CCCGCCAACG CATCGGGCCC GGACCCGGCG 60

CTGAGCTGCT CCAACGCGTC GACTCTGGCG CCGCTGCCGG CGCCGCTGGC GGTGGCTGTA 120

CCAGTTGTCT ACGCGGTGAT CTGCGCCGTG GGTCTGGCGG GCAACTCCGC CGTGCTGTAC 180

GTGTTGCTGC GGGCGCCCCG CATGAAGACC GTCACCAACC TGTTCATCCT CAACCTGGCC 240

ATCGCCGACG AGCTCTTCAC GCTGGTGCTG CCCATCAACA TCGCCGACTT CCTGCTGCGG 300 

CAGTGGCCCT TCGGGGAGCT CATGTGCAAG CTCATCGTGG CTATCGACCA GTACAACACC 360 

TTCTCCAGCC TCTACTTCCT CACCGTCATG AGCGCCGACC GCTACCTGGT GGTGTTGGCC 420 

ACTGCGGAGT CGCGCCGGGT GGCCGGCCGC ACCTACAGCG CCGCGCGCGC GGTGAGCCTG 480 

GCCGTGTGGG GGATCGTCAC ACTCGTCGTG CTGCCCTTCG CAGTCTTCGC CCGGCTAGAC 540 

GACGAGCAGG GCCGGCGCCA GTGCGTGCTA GTCTTTCCGC AGCCCGAGGC CTTCTGGTGG 600 

CGCGCGAGCC GCCTCTACAC GCTGGTGCTG GGCTTCGCCA TCCCCGTGTC CACCATCTGT 660 

GTCCTCTATA CCACCCTGCT GTGCCGGCTG CATGCCATGC GGCTGGACAG CCACGCCAAG 720 

GCCCTGGAGC GCGCCAAGAA GCGGGTGACC TTCCTGGTGG TGGCAATCCT GGCGGTGTGC 780 

CTCCTCTGCT GGACGCCCTA CCACCTGAGC ACCGTGGTGG CGCTCACCAC CGACCTCCCG 840 

CAGACGCCGC TGGTCATCGC TATCTCCTAC TTCATCACCA GCCTGACGTA CGCCAACAGC 900 

TGCCTCAACC CCTTCCTCTA CGCCTTCCTG GACGCCAGCT TCCGCAGGAA CCTCCGCCAG 960 

CTGATAACTT GCCGCGCGGC AGCCTGA 987 
(13) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 328 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Asp Asn Ala Ser Phe Ser Glu Pro Trp Pro Ala Asn Ala Ser Gly
1 5 10 15

Pro Asp Pro Ala Leu Ser Cys Ser Asn Ala Ser Thr Leu Ala Pro Leu
20 25 30

Pro Ala Pro Leu Ala Val Ala Val Pro Val Val Tyr Ala Val Ile Cys
35 40 45

Ala Val Gly Leu Ala Gly Asn Ser Ala Val Leu Tyr Val Leu Leu Arg
50 55 60

Ala Pro Arg Met Lys Thr Val Thr Asn Leu Phe Ile Leu Asn Leu Ala
65 70 75 80

Ile Ala Asp Glu Leu Phe Thr Leu Val Leu Pro Ile Asn Ile Ala Asp
85 90 95

Phe Leu Leu Arg Gln Trp Pro Phe Gly Glu Leu Met Cys Lys Leu Ile
100 105 110

Val Ala Ile Asp Gln Tyr Asn Thr Phe Ser Ser Leu Tyr Phe Leu Thr
115 120 125

Val Met Ser Ala Asp Arg Tyr Leu Val Val Leu Ala Thr Ala Glu Ser
130 135 140

Arg Arg Val Ala Gly Arg Thr Tyr Ser Ala Ala Arg Ala Val Ser Leu
145 150 155 160

Ala Val Trp Gly Ile Val Thr Leu Val Val Leu Pro Phe Ala Val Phe
165 170 175

Ala Arg Leu Asp Asp Glu Gln Gly Arg Arg Gln Cys Val Leu Val Phe
180 185 190

Pro Gln Pro Glu Ala Phe Trp Trp Arg Ala Ser Arg Leu Tyr Thr Leu
195 200 205

Val Leu Gly Phe Ala Ile Pro Val Ser Thr Ile Cys Val Leu Tyr Thr
210 215 220

Thr Leu Leu Cys Arg Leu His Ala Met Arg Leu Asp Ser His Ala Lys
225 230 235 240

Ala Leu Glu Arg Ala Lys Lys Arg Val Thr Phe Leu Val Val Ala Ile
245 250 255

Leu Ala Val Cys Leu Leu Cys Trp Thr Pro Tyr His Leu Ser Thr Val
260 265 270

Val Ala Leu Thr Thr Asp Leu Pro Gln Thr Pro Leu Val Ile Ala Ile
275 280 285

Ser Tyr Phe Ile Thr Ser Leu Thr Tyr Ala Asn Ser Cys Leu Asn Pro
290 295 300

Phe Leu Tyr Ala Phe Leu Asp Ala Ser Phe Arg Arg Asn Leu Arg Gln
305 310 315 320

Leu Ile Thr Cys Arg Ala Ala Ala
325

(14) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CGGAATTCGT CAACGGTCCC AGCTACAATG 30 

(15) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATGGATCCCA GGCCCTTCAG CACCGCAATA T 31 

(16) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATGCAGGCCG CTGGGCACCC AGAGCCCCTT GACAGCAGGG GCTCCTTCTC CCTCCCCACG 60 

ATGGGTGCCA ACGTCTCTCA GGACAATGGC ACTGGCCACA ATGCCACCTT CTCCGAGCCA 120

CTGCCGTTCC TCTATGTGCT CCTGCCCGCC GTGTACTCCG GGATCTGTGC TGTGGGGCTG 180 

ACTGGCAACA CGGCCGTCAT CCTTGTAATC CTAAGGGCGC CCAAGATGAA GACGGTGACC 240 

AACGTGTTCA TCCTGAACCT GGCCGTCGCC GACGGGCTCT TCACGCTGGT ACTGCCCGTC 300 

AACATCGCGG AGCACCTGCT GCAGTACTGG CCCTTCGGGG AGCTGCTCTG CAAGCTGGTG 360 

CTGGCCGTCG ACCACTACAA CATCTTCTCC AGCATCTACT TCCTAGCCGT GATGAGCGTG 420

GACCGATACC TGGTGGTGCT GGCCACCGTG AGGTCCCGCC ACATGCCCTG GCGCACCTAC 480 

CGGGGGGCGA AGGTCGCCAG CCTGTGTGTC TGGCTGGGCG TCACGGTCCT GGTTCTGCCC 540 

TTCTTCTCTT TCGCTGGCGT CTACAGCAAC GAGCTGCAGG TCCCAAGCTG TGGGCTGAGC 600 

TTCCCGTGGC CCGAGCGGGT CTGGTTCAAG GCCAGCCGTG TCTACACTTT GGTCCTGGGC 660 

TTCGTGCTGC CCGTGTGCAC CATCTGTGTG CTCTACACAG ACCTCCTGCG CAGGCTGCGG 720 
GCCGTGCGGC TCCGCTCTGG AGCCAAGGCT CTAGGCAAGG CCAGGCGGAA GGTGACCGTC 780

CTGGTCCTCG TCGTGCTGGC CGTGTGCCTC CTCTGCTCGA CGCCCTTCCA CCTGGCCTCT 840

GTCGTGGCCC TGACCACGGA CCTGCCCCAG ACCCCACTGG TCATCAGTAT GTCCTACGTC 900

ATCACCAGCC TCACGTACGC CAACTCGTGC CTGAACCCCT TCCTCTACGC CTTTCTAGAT 960

GACAACTTCC GGAAGAACTT CCGCAGCATA TTGCGGTGCT GA 1002
(17) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Gln Ala Ala Gly His Pro Glu Pro Leu Asp Ser Arg Gly Ser Phe
1 5 10 15

Ser Leu Pro Thr Met Gly Ala Asn Val Ser Gln Asp Asn Gly Thr Gly
20 25 30

His Asn Ala Thr Phe Ser Glu Pro Leu Pro Phe Leu Tyr Val Leu Leu
35 40 45

Pro Ala Val Tyr Ser Gly Ile Cys Ala Val Gly Leu Thr Gly Asn Thr
50 55 60

Ala Val Ile Leu Val Ile Leu Arg Ala Pro Lys Met Lys Thr Val Thr
65 70 75 80

Asn Val Phe Ile Leu Asn Leu Ala Val Ala Asp Gly Leu Phe Thr Leu
85 90 95

Val Leu Pro Val Asn Ile Ala Glu His Leu Leu Gln Tyr Trp Pro Phe
100 105 110

Gly Glu Leu Leu Cys Lys Leu Val Leu Ala Val Asp His Tyr Asn Ile
115 120 125

Phe Ser Ser Ile Tyr Phe Leu Ala Val Met Ser Val Asp Arg Tyr Leu
130 135 140

Val Val Leu Ala Thr Val Arg Ser Arg His Met Pro Trp Arg Thr Tyr
145 150 155 160

Arg Gly Ala Lys Val Ala Ser Leu Cys Val Trp Leu Gly Val Thr Val
165 170 175

Leu Val Leu Pro Phe Phe Ser Phe Ala Gly Val Tyr Ser Asn Glu Leu
180 185 190

Gln Val Pro Ser Cys Gly Leu Ser Phe Pro Trp Pro Glu Arg Val Trp
195 200 205

Phe Lys Ala Ser Arg Val Tyr Thr Leu Val Leu Gly Phe Val Leu Pro
210 215 220

Val Cys Thr Ile Cys Val Leu Tyr Thr Asp Leu Leu Arg Arg Leu Arg
225 230 235 240

Ala Val Arg Leu Arg Ser Gly Ala Lys Ala Leu Gly Lys Ala Arg Arg
245 250 255

Lys Val Thr Val Leu Val Leu Val Val Leu Ala Val Cys Leu Leu Cys
260 265 270

Trp Thr Pro Phe His Leu Ala Ser Val Val Ala Leu Thr Thr Asp Leu
275 280 285

Pro Gln Thr Pro Leu Val Ile Ser Met Ser Tyr Val Ile Thr Ser Leu
290 295 300

Thr Tyr Ala Asn Ser Cys Leu Asn Pro Phe Leu Tyr Ala Phe Leu Asp
305 310 315 320

Asp Asn Phe Arg Lys Asn Phe Arg Ser Ile Leu Arg Cys
325 330

(18) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ACGAATTCAG CCATGGTCCT TGAGGTGAGT GACCACCAAG TGCTAAAT 48 

(19) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAGGATCCTG GAATGCGGGG AAGTCAG 27 

(20) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
ATGGTCCTTG AGGTGAGTGA CCACCAAGTG CTAAATGACG CCGAGGTTGC CGCCCTCCTG 60 
GAGAACTTCA GCTCTTCCTA TGACTATGGA GAAAACGAGA GTGACTCGTG CTGTACGTCC 120 
CCGCCCTGCC CACAGGACTT CAGCCTGAAC TTCGACCGGG CCTTCCTGCC AGCCCTCTAC 180 
AGCCTCCTCT TTCTGCTGGG GCTGCTGGGC AACGGCGCGG TGGCAGCCGT GCTGCTGAGC 240 
CGGCGGACAG CCCTGAGCAG CACCGACACC TTCCTGCTCC ACCTAGCTGT AGCAGACACG 300 
CTGCTGGTGC TGACACTGCC GCTCTGGGCA GTGGACGCTG CCGTCCAGTG GGTCTTTGGC 360 
TCTGGCCTCT GCAAAGTGGC AGGTGCCCTC TTCAACATCA ACTTCTACGC AGGAGCCCTC 420 
CTGCTGGCCT GCATCAGCTT TGACCGCTAC CTGAACATAG TTCATGCCAC CCAGCTCTAC 480 
CGCCGGGGGC CCCCGGCCCG CGTGACCCTC ACCTGCCTGG CTGTCTGGGG GCTCTGCCTG 540 
CTTTTCGCCC TCCCAGACTT CATCTTCCTG TCGGCCCACC ACGACGAGCG CCTCAACGCC 600 
ACCCACTGCC AATACAACTT CCCACAGGTG GGCCGCACGG CTCTGCGGGT GCTGCAGCTG 660 
GTGGCTGGCT TTCTGCTGCC CCTGCTGGTC ATGGCCTACT GCTATGCCCA CATCCTGGCC 720 
GTGCTGCTGG TTTCCAGGGG CCAGCGGCGC CTGCGGGCCA TGCGGCTGGT GGTGGTGGTC 780 
GTGGTGGCCT TTGCCCTCTG CTGGACCCCC TATCACCTGG TGGTGCTGGT GGACATCCTC 840 
ATGGACCTGG GCGCTTTGGC CCGCAACTGT GGCCGAGAAA GCAGGGTAGA CGTGGCCAAG 900 
TCGGTCACCT CAGGCCTGGG CTACATGCAC TGCTGCCTCA ACCCGCTGCT CTATGCCTTT 960 
GTAGGGGTCA AGTTCCGGGA GCGGATGTGG ATGCTGCTCT TGCGCCTGGG CTGCCCCAAC 1020 
CAGAGAGGGC TCCAGAGGCA GCCATCGTCT TCCCGCCGGG ATTCATCCTG GTCTGAGACC 1080 
TCAGAGGCCT CCTACTCGGG CTTGTGA 1107 

(21) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Val Leu Glu Val Ser Asp His Gln Val Leu Asn Asp Ala Glu Val 
1 5 10 15 

Ala Ala Leu Leu Glu Asn Phe Ser Ser Ser Tyr Asp Tyr Gly Glu Asn 
20 25 30 

Glu Ser Asp Ser Cys Cys Thr Ser Pro Pro Cys Pro Gln Asp Phe Ser 
35 40 45 

Leu Asn Phe Asp Arg Ala Phe Leu Pro Ala Leu Tyr Ser Leu Leu Phe 
50 55 60 

Leu Leu Gly Leu Leu Gly Asn Gly Ala Val Ala Ala Val Leu Leu Ser 
65 70 75 80 

Arg Arg Thr Ala Leu Ser Ser Thr Asp Thr Phe Leu Leu His Leu Ala 
85 90 95 

Val Ala Asp Thr Leu Leu Val Leu Thr Leu Pro Leu Trp Ala Val Asp 
100 105 110 

Ala Ala Val Gln Trp Val Phe Gly Ser Gly Leu Cys Lys Val Ala Gly 
115 120 125 

Ala Leu Phe Asn Ile Asn Phe Tyr Ala Gly Ala Leu Leu Leu Ala Cys 
130 135 140 

Ile Ser Phe Asp Arg Tyr Leu Asn Ile Val His Ala Thr Gln Leu Tyr 
145 150 155 160 

Arg Arg Gly Pro Pro Ala Arg Val Thr Leu Thr Cys Leu Ala Val Trp 
165 170 175 

Gly Leu Cys Leu Leu Phe Ala Leu Pro Asp Phe Ile Phe Leu Ser Ala 
180 185 190 

His His Asp Glu Arg Leu Asn Ala Thr His Cys Gln Tyr Asn Phe Pro 
195 200 205 

Gln Val Gly Arg Thr Ala Leu Arg Val Leu Gln Leu Val Ala Gly Phe 
210 215 220 

Leu Leu Pro Leu Leu Val Met Ala Tyr Cys Tyr Ala His Ile Leu Ala 
225 230 235 240 

Val Leu Leu Val Ser Arg Gly Gln Arg Arg Leu Arg Ala Met Arg Leu 
245 250 255 

Val Val Val Val Val Val Ala Phe Ala Leu Cys Trp Thr Pro Tyr His 
260 265 270 

Leu Val Val Leu Val Asp Ile Leu Met Asp Leu Gly Ala Leu Ala Arg 
275 280 285 

Asn Cys Gly Arg Glu Ser Arg Val Asp Val Ala Lys Ser Val Thr Ser 
290 295 300 

Gly Leu Gly Tyr Met His Cys Cys Leu Asn Pro Leu Leu Tyr Ala Phe 
305 310 315 320 

Val Gly Val Lys Phe Arg Glu Arg Met Trp Met Leu Leu Leu Arg Leu 
325 330 335 

Gly Cys Pro Asn Gln Arg Gly Leu Gln Arg Gln Pro Ser Ser Ser Arg 
340 345 350 

Arg Asp Ser Ser Trp Ser Glu Thr Ser Glu Ala Ser Tyr Ser Gly Leu 
355 360 365 

(22) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TTAAGCTTGA CCTAATGCCA TCTTGTGTCC 30 

(23) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TTGGATCCAA AAGAACCATG CACCTCAGAG 30 

(24) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATGGCTGATG ACTATGGCTC TGAATCCACA TCTTCCATGG AAGACTACGT TAACTTCAAC 60 

TTCACTGACT TCTACTGTGA GAAAAACAAT GTCAGGCAGT TTGCGAGCCA TTTCCTCCCA 120 

CCCTTGTACT GGCTCGTGTT CATCGTGGGT GCCTTGGGCA ACAGTCTTGT TATCCTTGTC 180 

TACTGGTACT GCACAAGAGT GAAGACCATG ACCGACATGT TCCTTTTGAA TTTGGCAATT 240 

GCTGACCTCC TCTTTCTTGT CACTCTTCCC TTCTGGGCCA TTGCTGCTGC TGACCAGTGG 300 

AAGTTCCAGA CCTTCATGTG CAAGGTGGTC AACAGCATGT ACAAGATGAA CTTCTACAGC 360 

TGTGTGTTGC TGATCATGTG CATCAGCGTG GACAGGTACA TTGCCATTGC CCAGGCCATG 420 

AGAGCACATA CTTGGAGGGA GAAAAGGCTT TTGTACAGCA AAATGGTTTG CTTTACCATC 480 

TGGGTATTGG CAGCTGCTCT CTGCATCCCA GAAATCTTAT ACAGCCAAAT CAAGGAGGAA 540 

TCCGGCATTG CTATCTGCAC CATGGTTTAC CCTAGCGATG AGAGCACCAA ACTGAAGTCA 600 

GCTGTCTTGA CCCTGAAGGT CATTCTGGGG TTCTTCCTTC CCTTCGTGGT CATGGCTTGC 660 

TGCTATACCA TCATCATTCA CACCCTGATA CAAGCCAAGA AGTCTTCCAA GCACAAAGCC 720 

CTAAAAGTGA CCATCACTGT CCTGACCGTC TTTGTCTTGT CTCAGTTTCC CTACAACTGC 780 

ATTTTGTTGG TGCAGACCAT TGACGCCTAT GCCATGTTCA TCTCCAACTG TGCCGTTTCC 840 

ACCAACATTG ACATCTGCTT CCAGGTCACC CAGACCATCG CCTTCTTCCA CAGTTGCCTG 900 

AACCCTGTTC TCTATGTTTT TGTGGGTGAG AGATTCCGCC GGGATCTCGT GAAAACCCTG 960 

AAGAACTTGG GTTGCATCAG CCAGGCCCAG TGGGTTTCAT TTACAAGGAG AGAGGGTUiGC 1020 

TTGAAGCTGT CGTCTATGTT GCTGGAGACA ACCTCAGGAG CACTCTCCCT CTGA 1074 
(25) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Ala Asp Asp Tyr Gly Ser Glu Ser Thr Ser Ser Met Glu Asp Tyr 
1 5 10 15 

Val Asn Phe Asn Phe Thr Asp Phe Tyr Cys Glu Lys Asn Asn Val Arg 
20 25 30 

Gln Phe Ala Ser His Phe Leu Pro Pro Leu Tyr Trp Leu Val Phe Ile 
35 40 45 

Val Gly Ala Leu Gly Asn Ser Leu Val Ile Leu Val Tyr Trp Tyr Cys 
50 55 60 

Thr Arg Val Lys Thr Met Thr Asp Met Phe Leu Leu Asn Leu Ala Ile 
65 70 75 80 

Ala Asp Leu Leu Phe Leu Val Thr Leu Pro Phe Trp Ala Ile Ala Ala 
85 90 95 

Ala Asp Gln Trp Lys Phe Gln Thr Phe Met Cys Lys Val Val Asn Ser 
100 105 110 

Met Tyr Lys Met Asn Phe Tyr Ser Cys Val Leu Leu Ile Met Cys Ile 
115 120 125 

Ser Val Asp Arg Tyr Ile Ala Ile Ala Gln Ala Met Arg Ala His Thr 
130 135 140 

Trp Arg Glu Lys Arg Leu Leu Tyr Ser Lys Met Val Cys Phe Thr Ile 
145 150 155 160 

Trp Val Leu Ala Ala Ala Leu Cys Ile Pro Glu Ile Leu Tyr Ser Gln 
165 170 175 

Ile Lys Glu Glu Ser Gly Ile Ala Ile Cys Thr Met Val Tyr Pro Ser 
180 185 190 

Asp Glu Ser Thr Lys Leu Lys Ser Ala Val Leu Thr Leu Lys Val Ile 
195 200 205 

Leu Gly Phe Phe Leu Pro Phe Val Val Met Ala Cys Cys Tyr Thr Ile 
210 215 220 

Ile Ile His Thr Leu Ile Gln Ala Lys Lys Ser Ser Lys His Lys Ala 
225 230 235 240 

Leu Lys Val Thr Ile Thr Val Leu Thr Val Phe Val Leu Ser Gln Phe 
245 250 255 

Pro Tyr Asn Cys Ile Leu Leu Val Gln Thr Ile Asp Ala Tyr Ala Met 
260 265 270 

Phe Ile Ser Asn Cys Ala Val Ser Thr Asn Ile Asp Ile Cys Phe Gln 
275 280 285 

Val Thr Gln Thr Ile Ala Phe Phe His Ser Cys Leu Asn Pro Val Leu 
290 295 300 

Tyr Val Phe Val Gly Glu Arg Phe Arg Arg Asp Leu Val Lys Thr Leu 
305 310 315 320 

Lys Asn Leu Gly Cys Ile Ser Gln Ala Gln Trp Val Ser Phe Thr Arg 
325 330 335 

Arg Glu Gly Ser Leu Lys Leu Ser Ser Met Leu Leu Glu Thr Thr Ser 
340 345 350 

Gly Ala Leu Ser Leu 
355 



(26) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1110 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATGGCCTCAT CGACCACTCG GGGCCCCAGG GTTTCTGACT TATTTTCTGG GCTGCCGCCG 60 

GCGGTCACAA CTCCCGCCAA CCAGAGCGCA GAGGCCTCGG CGGGCAACGG GTCGGTGGCT 120 

GGCGCGGACG CTCCAGCCGT CACGCCCTTC CAGAGCCTGC AGCTGGTGCA TCAGCTGAAG 180 

GGGCTGATCG TGCTGCTCTA CAGCGTCGTG GTGGTCGTGG GGCTGGTGGG CAACTGCCTG 240 

CTGGTGCTGG TGATCGCGCG GGTGCCGCGG CTGCACAACG TGACGAACTT CCTCATCGGC 300 

AACCTGGCCT TGTCCGACGT GCTCATGTGC ACCGCCTGCG TGCCGCTCAC GCTGGCCTAT 360 

GCCTTCGAGC CACGCGGCTG GGTGTTCGGC GGCGGCCTGT GCCACCTGGT CTTCTTCCTG 420 

CAGCCGGTCA CCGTCTATGT GTCGGTGTTC ACGCTCACCA CCATCGCAGT GGACCGCTAC 480 

GTCGTGCTGG TGCACCCGCT GAGGCGCGCA TCTCGCTGCG CCTCAGCCTA CGCTGTGCTG 540 

GCCATCTGGG CGCTGTCCGC GGTGCTGGCG CTGCCGCCCG CCGTGCACAC CTATCACGTG 600 

GAGCTCAAGC CGCACGACGT GCGCCTCTGC GAGGAGTTCT GGGGCTCCCA GGAGCGCCAG 660 

CGCCAGCTCT ACGCCTGGGG GCTGCTGCTG GTCACCTACC TGCTCCCTCT GCTGGTCATC 720 

CTCCTGTCTT ACGTCCGGGT GTCAGTGAAG CTCCGCAACC GCGTGGTGCC GGGCTGCGTG 780 

ACCCAGAGCC AGGCCGACTG GGACCGCGCT CGGCGCCGGC GCACCTTCTG CTTGCTGGTG 840 

GTGGTCGTGG TGGTGTTCGC CGTCTGCTGG CTGCCGCTGC ACGTCTTCAA CCTGCTGCGG 900 

GACCTCGACC CCCACGCCAT CGACCCTTAC GCCTTTGGGC TGGTGCAGCT GCTCTGCCAC 960 

TGGCTCGCCA TGAGTTCGGC CTGCTApiAC CCCTTCATCT ACGCCTGGCT GCACGACAGC 1020 

TTCCGCGAGG AGCTGCGCAA ACTGTTGGTC GCTTGGCCCC GCAAGATAGC CCCCCATGGC 1080 

CAGAATATGA CCGTCAGCGT GGTCATCTGA 1110 
(27) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 369 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
Met Ala Ser Ser Thr Thr Arg Gly Pro Arg Val Ser Asp Leu Phe Ser 
1 5 10 15 

Gly Leu Pro Pro Ala Val Thr Thr Pro Ala Asn Gln Ser Ala Glu Ala 
20 25 30 

Ser Ala Gly Asn Gly Ser Val Ala Gly Ala Asp Ala Pro Ala Val Thr 
35 40 45 

Pro Phe Gln Ser Leu Gln Leu Val His Gln Leu Lys Gly Leu Ile Val 
50 55 60 

Leu Leu Tyr Ser Val Val Val Val Val Gly Leu Val Gly Asn Cys Leu 
65 70 75 80 

Leu Val Leu Val Ile Ala Arg Val Pro Arg Leu His Asn Val Thr Asn 
85 90 95 

Phe Leu Ile Gly Asn Leu Ala Leu Ser Asp Val Leu Met Cys Thr Ala 
100 105 110 

Cys Val Pro Leu Thr Leu Ala Tyr Ala Phe Glu Pro Arg Gly Trp Val 
115 120 125 

Phe Gly Gly Gly Leu Cys His Leu Val Phe Phe Leu Gln Pro Val Thr 
130 135 140 

Val Tyr Val Ser Val Phe Thr Leu Thr Thr Ile Ala Val Asp Arg Tyr 
145 150 155 160 

Val Val Leu Val His Pro Leu Arg Arg Ala Ser Arg Cys Ala Ser Ala 
165 170 175 

Tyr Ala Val Leu Ala Ile Trp Ala Leu Ser Ala Val Leu Ala Leu Pro 
180 185 190 

Pro Ala Val His Thr Tyr His Val Glu Leu Lys Pro His Asp Val Arg 
195 200 205 

Leu Cys Glu Glu Phe Trp Gly Ser Gln Glu Arg Gln Arg Gln Leu Tyr 
210 215 220 

Ala Trp Gly Leu Leu Leu Val Thr Tyr Leu Leu Pro Leu Leu Val Ile 
225 230 235 240 

Leu Leu Ser Tyr Val Arg Val Ser Val Lys Leu Arg Asn Arg Val Val 
245 250 255 

Pro Gly Cys Val Thr Gln Ser Gln Ala Asp Trp Asp Arg Ala Arg Arg 
260 265 270 

Arg Arg Thr Phe Cys Leu Leu Val Val Val Val Val Val Phe Ala Val 
275 280 285 

Cys Trp Leu Pro Leu His Val Phe Asn Leu Leu Arg Asp Leu Asp Pro 
290 295 300 

His Ala Ile Asp Pro Tyr Ala Phe Gly Leu Val Gln Leu Leu Cys His 
305 310 315 320 

Trp Leu Ala Met Ser Ser Ala Cys Tyr Asn Pro Phe Ile Tyr Ala Trp 
325 330 335 

Leu His Asp Ser Phe Arg Glu Glu Leu Arg Lys Leu Leu Val Ala Trp 
340 345 350 

Pro Arg Lys Ile Ala Pro His Gly Gln Asn Met Thr Val Ser Val Val 
355 360 365 

Ile 



(28) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1083 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

ATGGACCCAG AAGAAACTTC AGTTTATTTG GATTATTACT ATGCTACGAG CCCAAACTCT 60 

GACATCAGGG AGACCCACTC CCATGTTCCT TACACCTCTG TCTTCCTTCC AGTCTTTTAC 120 

ACAGCTGTGT TCCTGACTGG AGTGCTGGGG AACCTTGTTC TCATGGGAGC GTTGCATTTC 180 

AAACCCGGCA GCCGAAGACT GATCGACATC TTTATCATCA ATCTGGCTGC CTCTGACTTC 240 

ATTTTTCTTG TCACATTGCC TCTCTGGGTG GATAAAGAAG CATCTCTAGG ACTGTGGAGG 300 

ACGGGCTCCT TCCTGTGCAA AGGGAGCTCC TACATGATCT CCGTCAATAT GCACTGCAGT 360 

GTCCTCCTGC TCACTTGCAT GAGTGTTGAC CGCTACCTGG CCATTGTGTG GCCAGTCGTA 420 

TCCAGGAAAT TCAGAAGGAC AGACTGTGCA TATGTAGTCT GTGCCAGCAT CTGGTTTATC 480 

TCCTGCCTGC TGGGGTTGCC TACTCTTCTG TCCAGGGAGC TCACGCTGAT TGATGATAAG 540 

CCATACTGTG CAGAGAAAAA GGCAACTCCA ATTAAACTCA TATGGTCCCT GGTGGCCTTA 600 

ATTTTCACCT TTTTTGTCCC TTTGTTGAGC ATTGTGACCT GCTACTGTTG CATTGCAAGG 660 

AAGCTGTGTG CCCATTACCA GCAATCAGGA AAGCACAACA AAAAGCTGAA GAAATCTATA 720 

AAGATCATCT TTATTGTCGT GGCAGCCTTT CTTGTCTCCT GGCTGCCCTT GAATACTTTC 780 

AAGTTCCTGG CCATTGTCTC TGGGTTGCGG CAAGAACACT ATTTACCCTC AGCTATTCTT 840 

CAGCTTGGTA TGGAGGTGAG TGGACCCTTG GCATTTGCCA ACAGCTGTGT CAACCCTTTC 900 

ATTTACTATA TCTTCGACAG CTACATCCGC CGGGCCATTG TCCACTGCTT GTGCCCTTGC 960 

CTGAAAAACT ATGACTTTGG GAGTAGCACT GAGACATCAG ATAGTCACCT CACTAAGGCT 1020 

CTCTCCACCT TCATTCATGC AGAAGATTTT GCCAGGAGGA GGAAGAGGTC TGTGTCACTC 1080 

~ 1083 
(29) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Met Asp Pro Glu Glu Thr Ser Val Tyr Leu Asp Tyr Tyr Tyr Ala Thr 
1 5 10 15 

Ser Pro Asn Ser Asp Ile Arg Glu Thr His Ser His Val Pro Tyr Thr 
20 25 30 

Ser Val Phe Leu Pro Val Phe Tyr Thr Ala Val Phe Leu Thr Gly Val 
35 40 45 

Leu Gly Asn Leu Val Leu Met Gly Ala Leu His Phe Lys Pro Gly Ser 
50 55 60 

Arg Arg Leu Ile Asp Ile Phe Ile Ile Asn Leu Ala Ala Ser Asp Phe 
65 70 75 80 

Ile Phe Leu Val Thr Leu Pro Leu Trp Val Asp Lys Glu Ala Ser Leu 
85 90 95 

Gly Leu Trp Arg Thr Gly Ser Phe Leu Cys Lys Gly Ser Ser Tyr Met 
100 105 110 

Ile Ser Val Asn Met His Cys Ser Val Leu Leu Leu Thr Cys Met Ser 
115 120 125 

Val Asp Arg Tyr Leu Ala Ile Val Trp Pro Val Val Ser Arg Lys Phe 
130 135 140 

Arg Arg Thr Asp Cys Ala Tyr Val Val Cys Ala Ser Ile Trp Phe Ile 
145 150 155 160 

Ser Cys Leu Leu Gly Leu Pro Thr Leu Leu Ser Arg Glu Leu Thr Leu 
165 170 175 

Ile Asp Asp Lys Pro Tyr Cys Ala Glu Lys Lys Ala Thr Pro Ile Lys 
180 185 190 

Leu Ile Trp Ser Leu Val Ala Leu Ile Phe Thr Phe Phe Val Pro Leu 
195 200 205 

Leu Ser Ile Val Thr Cys Tyr Cys Cys Ile Ala Arg Lys Leu Cys Ala 
210 215 220 

His Tyr Gln Gln Ser Gly Lys His Asn Lys Lys Leu Lys Lys Ser Ile 
225 230 235 240 

Lys Ile Ile Phe Ile Val Val Ala Ala Phe Leu Val Ser Trp Leu Pro 
245 250 255 

Phe Asn Thr Phe Lys Phe Leu Ala Ile Val Ser Gly Leu Arg Gln Glu 
260 265 270 

His Tyr Leu Pro Ser Ala Ile Leu Gln Leu Gly Met Glu Val Ser Gly 
275 280 285 

Pro Leu Ala Phe Ala Asn Ser Cys Val Asn Pro Phe Ile Tyr Tyr Ile 
290 295 300 

Phe Asp Ser Tyr Ile Arg Arg Ala Ile Val His Cys Leu Cys Pro Cys 
305 310 315 320 

Leu Lys Asn Tyr Asp Phe Gly Ser Ser Thr Glu Thr Ser Asp Ser His 
325 330 335 

Leu Thr Lys Ala Leu Ser Thr Phe Ile His Ala Glu Asp Phe Ala Arg 
340 345 350 

Arg Arg Lys Arg Ser Val Ser Leu 
355 360 

(30) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
CTAGAATTCT GACTCCAGCC AAAGCATGAA T 31 

(31) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GCTGGATCCT AAACAGTCTG CGCTCGGCCT 30 

(32) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1020 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATGAATGGCC TTGAAGTGGC TCCCCCAGGT CTGATCACCA ACTTCTCCCT GGCCACGGCA 60 

GAGCAATGTG GCCAGGAGAC GCCACTGGAG AACATGCTGT TCGCCTCCTT CTACCTTCTG 120 

GATTTTATCC TGGCTTTAGT TGGCAATACC CTGGCTCTGT GGCTTTTCAT CCGAGACCAC 180 

AAGTCCGGGA CCCCGGCCAA CGTGTTCCTG ATGGATCTGG CCGTGGCCGA CTTGTCGTGC 240 



GTGCTGGTCC TGCCCACCCG CCTGGTCTAC CACTTCTCTG GGAACCACTG GCCATTTGGG 300 

GAAATCGCAT GCCGTCTCAC CGGCTTCCTC TTCTACCTCA ACATGTACGC CAGCATCTAC 360 

TTCCTCACCT GCATCAGCGC CGACCGTTTC CTGGCCATTG TGCACCCGGT CAAGTCCCTC 420 

AAGCTCCGCA GGCCCCTCTA CGCACACCTG GCCTGTGCCT TCCTGTGGGT GGTGGTGGCT 480 

GTGGCCATGG CCCCGCTGCT GGTGAGCCCA CAGACCGTGC AGACCAACCA CACGGTGGTC 540 

TGCCTGCAGC TGTACCGGGA GAAGGCCTCC CACCATGCCC TGGTGTCCCT GGCAGTGGCC 600 

TTCACCTTCC CGTTCATCAC CACGGTCACC TGCTACCTGC TGATCATCCG CAGCCTGCGG 660 

CAGGGCCTGC GTGTGGAGAA GCGCCTCAAG ACCAAGGCAG TGCGCATGAT CGCCATAGTG 720 

CTGGCCATCT TCCTGGTCTG CTTCGTGCCC TACCACGTCA ACCGCTCCGT CTACGTGCTG 780 

CACTACCGCA GCCATGGGGC CTCCTGCGCC ACCCAGCGCA TCCTGGCCCT GGCAAACCGC 840 

ATCACCTCCT GCCTCACCAG CCTCAACGGG GCACTCGACC CCATCATGTA TTTCTTCGTG 900 

GCTGAGAAGT TCCGCCACGC CCTGTGCAAC TTGCTCTGTG GCAAAAGGCT CAAGGGCCCG 960 

CCCCCCAGCT TCGAAGGGAA AACCAACGAG AGCTCGCTGA GTGCCAAGTC AGAGCTGTGA 1020 
(33) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Met Asn Gly Leu Glu Val Ala Pro Pro Gly Leu Ile Thr Asn Phe Ser 
1 5 10 15 

Leu Ala Thr Ala Glu Gln Cys Gly Gln Glu Thr Pro Leu Glu Asn Met 
20 25 30 

Leu Phe Ala Ser Phe Tyr Leu Leu Asp Phe Ile Leu Ala Leu Val Gly 
35 40 45 

Asn Thr Leu Ala Leu Trp Leu Phe Ile Arg Asp His Lys Ser Gly Thr 
50 55 60 

Pro Ala Asn Val Phe Leu Met His Leu Ala Val Ala Asp Leu Ser Cys 
65 70 75 80 

Val Leu Val Leu Pro Thr Arg Leu Val Tyr His Phe Ser Gly Asn His 
85 90 95 

Trp Pro Phe Gly Glu Ile Ala Cys Arg Leu Thr Gly Phe Leu Phe Tyr 
100 105 110 

Leu Asn Met Tyr Ala Ser Ile Tyr Phe Leu Thr Cys Ile Ser Ala Asp 
115 120 125 

Arg Phe Leu Ala Ile Val His Pro Val Lys Ser Leu Lys Leu Arg Arg 
130 135 140 

Pro Leu Tyr Ala His Leu Ala Cys Ala Phe Leu Trp Val Val Val Ala 
145 150 155 160 

Val Ala Met Ala Pro Leu Leu Val Ser Pro Gln Thr Val Gln Thr Asn 
165 170 175 

His Thr Val Val Cys Leu Gln Leu Tyr Arg Glu Lys Ala Ser His His 
180 185 190 

Ala Leu Val Ser Leu Ala Val Ala Phe Thr Phe Pro Phe Ile Thr Thr 
195 200 205 

Val Thr Cys Tyr Leu Leu Ile Ile Arg Ser Leu Arg Gln Gly Leu Arg 
210 215 220 

Val Glu Lys Arg Leu Lys Thr Lys Ala Val Arg Met Ile Ala Ile Val 
225 230 235 240 

Leu Ala Ile Phe Leu Val Cys Phe Val Pro Tyr His Val Asn Arg Ser 
245 250 255 

Val Tyr Val Leu His Tyr Arg Ser His Gly Ala Ser Cys Ala Thr Gln 
260 265 270 

Arg Ile Leu Ala Leu Ala Asn Arg Ile Thr Ser Cys Leu Thr Ser Leu 
275 280 285 

Asn Gly Ala Leu Asp Pro Ile Met Tyr Phe Phe Val Ala Glu Lys Phe 
290 295 300 

Arg His Ala Leu Cys Asn Leu Leu Cys Gly Lys Arg Leu Lys Gly Pro 
305 310 315 320 

Pro Pro Ser Phe Glu Gly Lys Thr Asn Glu Ser Ser Leu Ser Ala Lys 
325 330 335 

Ser Glu Leu 



(34) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ATAAGATGAT CACCCTGAAC AATCAAGAT 29 

(35) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
TCCGAATTCA TAACATTTCA CTGTTTATAT TGC 33 

(36) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATGATCACCC TGAACAATCA AGATCAACCT GTCACTTTTA ACAGCTCACA TCCAGATGAA 60 

TACAAAATTG CAGCCCTTGT CTTCTATAGC TGTATCTTCA TAATTGGATT ATTTGTTAAC 120 

ATCACTGCAT TATGGGTTTT CAGTTGTACC ACCAAGAAGA GAACCACGGT AACCATCTAT 180 

ATGATGAATG TGGCATTAGT GGACTTGATA TTTATAATGA CTTTACCCTT TCGAATGTTT 240 

TATTATGCAA AAGATGCATG GCCATTTGGA GAGTACTTCT GCCAGATTAT TGGAGCTCTC 300 

ACAGTGTTTT ACCCAAGCAT TGCTTTATGG CTTCTTGCCT TTATTAGTGC TGACAGATAC 360 

ATGGCCATTG TACAGCCGAA GTACGCCAAA GAACTTAAAA ACACGTGCAA AGCCGTGCTG 420 

GCGTGTGTGG GAGTCTGGAT AATGACCCTG ACCACGACCA CCCCTCTGCT ACTGCTCTAT 480 

AAAGACCCAG ATAAAGACTC CACTCCCGCC ACCTGCCTCA AGATTTCTGA CATCATCTAT 540 

CTAAAAGCTG TGAACGTGCT GAACCTCACT CGACTGACAT TTTTTTTCTT GATTCCTTTG 600 

TTCATCATGA TTGGGTGCTA CTTGGTCATT ATTCATAATC TCCTTCACGG CAGGACGTCT 660 

AAGCTGAAAC CCAAAGTCAA GGAGAAGTCC ATAAGGATCA TCATCACGCT GCTGGTGCAG 720 

GTGCTCGTCT GCTTTATGCC CTTCCACATC TGTTTCGCTT TCCTGATGCT GGGAACGGGG 780 

GAGAACAGTT ACAATCCCTG GGGAGCCTTT ACCACCTTCC TCATGAACCT CAGCACGTGT 840 

CTGGATGTGA TTCTCTACTA CATCGTTTCA AAACAATTTC AGGCTCGAGT CATTAGTGTC 900 

ATGCTATACC GTAATTACCT TCGAAGCCTG CGCAGAAAAA GTTTCCGATC TGGTAGTCTA 960 

AGGTCACTAA GCAATATAAA CAGTGAAATG TTATGA 996 

(37) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Met Ile Thr Leu Asn Asn Gln Asp Gln Pro Val Thr Phe Asn Ser Ser 
1 5 10 15 

His Pro Asp Glu Tyr Lys Ile Ala Ala Leu Val Phe Tyr Ser Cys Ile 
20 25 30 

Phe Ile Ile Gly Leu Phe Val Asn Ile Thr Ala Leu Trp Val Phe Ser 
35 40 45 

Cys Thr Thr Lys Lys Arg Thr Thr Val Thr Ile Tyr Met Met Asn Val 
50 55 60 

Ala Leu Val Asp Leu Ile Phe Ile Met Thr Leu Pro Phe Arg Met Phe 
65 70 75 80 

Tyr Tyr Ala Lys Asp Ala Trp Pro Phe Gly Glu Tyr Phe Cys Gln Ile 
85 90 95 

Ile Gly Ala Leu Thr Val Phe Tyr Pro Ser Ile Ala Leu Trp Leu Leu 
100 105 110 

Ala Phe Ile Ser Ala Asp Arg Tyr Met Ala Ile Val Gln Pro Lys Tyr 
115 120 125 

Ala Lys Glu Leu Lys Asn Thr Cys Lys Ala Val Leu Ala Cys Val Gly 
130 135 140 

Val Trp Ile Met Thr Leu Thr Thr Thr Thr Pro Leu Leu Leu Leu Tyr 
145 150 155 160 

Lys Asp Pro Asp Lys Asp Ser Thr Pro Ala Thr Cys Leu Lys Ile Ser 
165 170 175 

Asp Ile Ile Tyr Leu Lys Ala Val Asn Val Leu Asn Leu Thr Arg Leu 

180 185 190 

Thr Phe Phe Phe Leu Ile Pro Leu Phe Ile Met Ile Gly Cys Tyr Leu 
195 200 205 

Val Ile Ile His Asn Leu Leu His Gly Arg Thr Ser Lys Leu Lys Pro 
210 215 220 

Lys Val Lys Glu Lys Ser Ile Arg Ile Ile Ile Thr Leu Leu Val Gln 
225 230 235 240 

Val Leu Val Cys Phe Met Pro Phe His Ile Cys Phe Ala Phe Leu Met 
245 250 255 

Leu Gly Thr Gly Glu Asn Ser Tyr Asn Pro Trp Gly Ala Phe Thr Thr 
260 265 270 

Phe Leu Met Asn Leu Ser Thr Cys Leu Asp Val Ile Leu Tyr Tyr Ile 
275 280 285 

Val Ser Lys Gln Phe Gln Ala Arg Val Ile Ser Val Met Leu Tyr Arg 
290 295 300 

Asn Tyr Leu Arg Ser Leu Arg Arg Lys Ser Phe Arg Ser Gly Ser Leu 
305 310 315 320 

Arg Ser Leu Ser Asn Ile Asn Ser Glu Met Leu 
325 330 



(38) INFORMATION FOR SEQ ID NO:37: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
CCAAGCTTCC AGGCCTGGGG TGTGCTGG 28 

(39) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATGGATCCTG ACCTTCGGCC CCTGGCAGA 29 

(40) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1077 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

ATGCCCTCTG TGTCTCCAGC GGGGCCCTCG GCCGGGGCAG TCCCCAATGC CACCGCAGTG 60 

ACAACAGTGC GGACCAATGC CAGCGGGCTG GAGGTGCCCC TGTTCCACCT GTTTGCCCGG 120 

CTGGACGAGG AGCTGCATGG CACCTTCCCA GGCCTGTGCG TGGCGCTGAT GGCGGTGCAC 180 

GGAGCCATCT TCCTGGCAGG GCTGGTGCTC AACGGGCTGG CGCTGTACGT CTTCTGCTGC 240 

CGCACCCGGG CCAAGACACC CTCAGTCATC TACACCATCA ACCTGGTGGT GACCGATCTA 300 

CTGGTAGGGC TGTCCCTGCC CACGCGCTTC GCTGTGTACT ACGGCGCCAG GGGCTGCCTG 360 

CGCTGTGCCT TCCCGCACGT CCTCGGTTAC TTCCTCAACA TGCACTGCTC CATCCTCTTC 420 

CTCACCTGCA TCTGCGTGGA CCGCTACCTG GCCATCGTGC GGCCCGAAGG CTCCCGCCGC 480 

TGCCGCCAGC CTGCCTGTGC CAGGGCCGTG TGCGCCTTCG TGTGGCTGGC CGCCGGTGCC 540 

GTCACCCTGT CGGTGCTGGG CGTGACAGGC AGCCGGCCCT GCTGCCGTGT CTTTGCGCTG 600 

ACTGTCCTGG AGTTCCTGCT GCCCCTGCTG GTCATCAGCG TGTTTACCGG CCGCATCATG 660 

TGTGCACTGT CGCGGCCGGG TCTGCTCCAC CAGGGTCGCC AGCGCCGCGT GCGGGCCATG 720 

CAGCTCCTGC TCACGGTGCT CATCATCTTT CTCGTCTGCT TCACGCCCTT CCACGCCCGC 780 

CAAGTGGCCG TGGCGCTGTG GCCCGACATG CCACACCACA CSAGCCTCGT GGTCTACCAC 840 

GTGGCCGTGA CCCTCAGCAG CCTCAACAGC TGCATGGACC CCATCGTCTA CTGCTTCGTC 900 

ACCAGTGGCT TCCAGGCCAC CGTCCGAGGC CTCTTCGGCC AGCACGGAGA GCGTGAGCCC 960 

AGCAGCGGTG ACGTGGTCAG CATGCACAGG AGCTCCAAGG GCTCAGGCCG TCATCACATC 1020 

CTCAGTGCCG GCCCTCACGC CCTCACCCAG GCCCTGGCTA ATCGGCCCGA GGCTTAG 1077 
(41) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 358 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



Met Pro Ser Val Ser Pro Ala Gly Pro Ser Ala Gly Ala Val Pro Asn 
1 5 10 15 

Ala Thr Ala Val Thr Thr Val Arg Thr Asn Ala Ser Gly Leu Glu Val 
20 25 30 

Pro Leu Phe His Leu Phe Ala Arg Leu Asp Glu Glu Leu His Gly Thr 
35 40 45 

Phe Pro Gly Leu Cys Val Ala Leu Met Ala Val His Gly Ala Ile Phe 
50 55 60 

Leu Ala Gly Leu Val Leu Asn Gly Leu Ala Leu Tyr Val Phe Cys Cys 
65 70 75 80 

Arg Thr Arg Ala Lys Thr Pro Ser Val Ile Tyr Thr Ile Asn Leu Val 
85 90 95 

Val Thr Asp Leu Leu Val Gly Leu Ser Leu Pro Thr Arg Phe Ala Val 
100 105 110 

Tyr Tyr Gly Ala Arg Gly Cys Leu Arg Cys Ala Phe Pro His Val Leu 
115 120 125 

Gly Tyr Phe Leu Asn Met His Cys Ser Ile Leu Phe Leu Thr Cys Ile 
130 135 140 

Cys Val Asp Arg Tyr Leu Ala Ile Val Arg Pro Glu Ala Pro Ala Ala 
145 150 155 160 

Cys Arg Gln Pro Ala Cys Ala Arg Ala Val Cys Ala Phe Val Trp Leu 
165 170 175 

Ala Ala Gly Ala Val Thr Leu Ser Val Leu Gly Val Thr Gly Ser Arg 
180 185 190 

Pro Cys Cys Arg Val Phe Ala Leu Thr Val Leu Glu Phe Leu Leu Pro 
195 200 205 

Leu Leu Val Ile Ser Val Phe Thr Gly Arg Ile Met Cys Ala Leu Ser 
210 215 220 

Arg Pro Gly Leu Leu His Gln Gly Arg Gln Arg Arg Val Arg Ala Met 
225 230 235 240 

Gln Leu Leu Leu Thr Val Leu Ile Ile Phe Leu Val Cys Phe Thr Pro 
245 250 255 

Phe His Ala Arg Gln Val Ala Val Ala Leu Trp Pro Asp Met Pro His 
260 265 270 

His Thr Ser Leu Val Val Tyr His Val Ala Val Thr Leu Ser Ser Leu 
275 280 285 

Asn Ser Cys Met Asp Pro Ile Val Tyr Cys Phe Val Thr Ser Gly Phe 
290 295 300 

Gln Ala Thr Val Arg Gly Leu Phe Gly Gln His Gly Glu Arg Glu Pro 
305 310 315 320 

Ser Ser Gly Asp Val Val Ser Met His Arg Ser Ser Lys Gly Ser Gly 
325 330 335 

Arg His His Ile Leu Ser Ala Gly Pro His Ala Leu Thr Gln Ala Leu 
340 345 350 

Ala Asn Gly Pro Glu Ala 
355 

(42) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GAGAATTCAC TCCTGAGCTC AAGATGAACT 30 

(43) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
CGGGATCCCC GTAACTGAGC CACTTCAGAT 30 

(44) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

ATGAACTCCA CCTTGGATGG TAATCAGAGC AGCCACCCTT TTTGCCTCTT GGCATTTGGC 60 

TATTTGGAAA CTGTCAATTT TTGCCTTTTG GAAGTATTGA TTATTGTCTT TCTAACTGTA 120 

TTGATTATTT CTGGCAACAT CATTGTGATT TTTGTATTTC ACTGTGCACC TTTGTTGAAC 180 

CATCACACTA CAAGTTATTT TATCCAGACT ATGGCATATG CTGACCTTTT TGTTGGGGTG 240 

AGCTGCGTGG TCCCTTCTTT ATCACTCCTC CATCACCCCC TTCCAGTAGA GGAGTCCTTG 300 

ACTTGCCAGA TATTTGGTTT TGTAGTATCA GTTCTGAAGA GCGTCTCCAT GGCTTCTCTG 360 

GCCTGTATCA GCATTGATAG ATACATTGCC ATTACTAAAC CTTTAACCTA TAATACTCTG 420 

GTTACACCCT GGAGACTACG CCTGTGTATT TTCCTGATTT GGCTATACTC GACCCTGGTC 480 

TTCCTGCCTT CCTTTTTCCA CTGGGGCAAA CCTGGATATC ATGGAGATGT GTTTCAGTGG 540 

TGTGCGGAGT CCTGGCACAC CGACTCCTAC TTCACCCTGT TCATCGTGAT GATGTTATAT 600 

GCCCCAGCAG CCCTTATTGT CTGCTTCACC TATTTCAACA TCTTCCGCAT CTGCCAACAG 660 

CACACAAAGG ATATCAGCGA AAGGCAAGCC CGCTTCAGCA GCCAGAGTGG GGAGACTGGG 720 

GAAGTGCAGG CCTGTCCTGA TAAGCGCTAT GCCATGGTCC TGTTTCGAAT CACTAGTGTA 780 

TTTTACATCC TCTGGTTGCC ATATATCATC TACTTCTTGT TGGAAAGCTC CACTGGCCAC 840 

AGCAACCGCT TCGCATCCTT CTTGACCACC TGGCTTGCTA TTAGTAACAG TTTCTGCAAC 900 

TGTGTAATTT ATAGTCTCTC CAACAGTGTA TTCCAAAGAG GACTAAAGCG CCTCTCAGGG 960 

GCTATGTGTA CTTCTTGTGC AAGTCAGACT ACAGCCAACG ACCCTTACAC AGTTAGTUIGC 1020 

AAAGGCCCTC TTAATGGATG TCATATCTGA 1050 
(45) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 349 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
Met Asn Ser Thr Leu Asp Gly Asn Gln Ser Ser His Pro Phe Cys Leu 
1 5 10 15 

WO 00/22129 

PCT/US99/23938 

Leu Ala Phe Gly Tyr Leu Glu Thr Val Asn Phe Cys Leu Leu Glu Val 
20 25 30 

Leu Ile Ile Val Phe Leu Thr Val Leu Ile Ile Ser Gly Asn Ile Ile 
35 40 45 

Val Ile Phe Val Phe His Cys Ala Pro Leu Leu Asn His His Thr Thr 
50 55 60 

Ser Tyr Phe Ile Gln Thr Met Ala Tyr Ala Asp Leu Phe Val Gly Val 
65 70 75 80 

Ser Cys Val Val Pro Ser Leu Ser Leu Leu His His Pro Leu Pro Val 
85 90 95 

Glu Glu Ser Leu Thr Cys Gln Ile Phe Gly Phe Val Val Ser Val Leu 
100 105 110 

Lys Ser Val Ser Met Ala Ser Leu Ala Cys Ile Ser Ile Asp Arg Tyr 
115 120 125 

Ile Ala Ile Thr Lys Pro Leu Thr Tyr Asn Thr Leu Val Thr Pro Trp 
130 135 140 

Arg Leu Arg Leu Cys Ile Phe Leu Ile Trp Leu Tyr Ser Thr Leu Val 
145 150 155 160 

Phe Leu Pro Ser Phe Phe His Trp Gly Lys Pro Gly Tyr His Gly Asp 
165 170 175 

Val Phe Gln Trp Cys Ala Glu Ser Trp His Thr Asp Ser Tyr Phe Thr 
180 185 190 

Leu Phe Ile Val Met Met Leu Tyr Ala Pro Ala Ala Leu Ile Val Cys 
195 200 205 

Phe Thr Tyr Phe Asn Ile Phe Arg Ile Cys Gln Gln His Thr Lys Asp 
210 215 220 

Ile Ser Glu Arg Gln Ala Arg Phe Ser Ser Gln Ser Gly Glu Thr Gly 
225 230 235 240 

Glu Val Gln Ala Cys Pro Asp Lys Arg Tyr Ala Met Val Leu Phe Arg 
245 250 255 

Ile Thr Ser Val Phe Tyr Ile Leu Trp Leu Pro Tyr Ile Ile Tyr Phe 
260 265 270 

Leu Leu Glu Ser Ser Thr Gly His Ser Asn Arg Phe Ala Ser Phe Leu 
275 280 285 

Thr Thr Trp Leu Ala Ile Ser Asn Ser Phe Cys Asn Cys Val Ile Tyr 
290 295 300 

Ser Leu Ser Asn Ser Val Phe Gln Arg Gly Leu Lys Arg Leu Ser Gly 
305 310 315 320 

Ala Met Cys Thr Ser Cys Ala Ser Gln Thr Thr Ala Asn Asp Pro Tyr 
325 330 335 

Thr Val Arg Ser Lys Gly Pro Leu Asn Gly Cys His Ile 
340 345 

(46) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

TCCCCCGGGA AAAAAACCAA CTGCTCCAAA

(47) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

TAGGATCCAT TTGAATGTGG ATTTGGTGAA A

(48) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1302 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
ATGTGTTTTT CTCCCATTCT GGAAATCAAC ATGCAGTCTG AATCTAACAT TACAGTGCGA 60 
GATGACATTG ATGACATCAA CACCAATATG TACCAACCAC TATCATATCC GTTAAGCTTT 120



CAAGTGTCTC TCACCGGATT TCTTATGTTA GAAATTGTGT TGGGACTTGG CAGCAACCTC 180
ACTGTATTGG TACTTTACTG CATGAAATCC AACTTAATCA ACTCTGTCAG TAACATTATT 240
ACAATGAATC TTCATGTACT TGATGTAATA ATTTGTGTGG GATCTATTCC TCTAACTATA 300 
GTTATCCTTC TGCTTTCACT GGAGAGTAAC ACTCCTCTCA TTTGCTGTTT CCATGAGGCT 360
TGTGTATCTT TTGCAAGTGT CTCAACAGCA ATCAACGTTT TTCCTATCAC TTTCGACAGA 420
TATGACATCT CTGTAAAACC TGCaAACCGA ATTCTGACAA TGGGCAGAGC TGTAATGTTA 480
ATGATATCCA TTTGGATTTT TTCTTTTTTC TCTTTCCTGA TTCCTTTTAT TGAGGTAAAT 540
TTTTTCAGTC TTCAAAGTGG AAATACCTGG GAAAACAAGA CACTTTTATG TGTCAGTACA 600 
AATGAATACT ACACTGAACT GGGAATGTAT TATCACCTGT TAGTACAGAT CCCAATATTC 660 
TTTTTCACTG TTGTAGTAAT GTTAATCACA TACACCAAAA TACTTCAGGC TCTTAATATT 720
CGAATAGGCA CAAGATTTTC AACAGGGCAG AAGAAGAAAG CAAGAAAGAA AAAGACAATT 780 
TCTCTAACCA CACAACATGA GGCTACAGAC ATGTCACAAA GCAGTGGTGG GAGAAATGTA 840
GTCTTTGaTG TAAGAACTTC AGTTTCTGTA ATAATTGCCC TCCGGCGAGC TGTGAAACGA 900
CACCGTGAAC GACGAGAAAG ACAAAAGAGA GTCTTCAGGA TGTCTTTATT GATTATTTCT 960
ACATTTCTTC TCTGCTGGAC ACCAATTTCT GTTTTAAATA CCACCATTTT ATGTTTAGGC 1020 
CCAAGTGACC TTTTAGTAAA ATTAAGATTG TGTTTTTTAG TCATGGCTTA TGGAACAACT 1080 
ATATTTCACC CTCTATTATA TGCATTCACT AGACAAAAAT TTCAAAAGGT CTTGAAAAGT 1140 
AAAATGAAAA AGCGAGTTGT TTCTATAGTA GAAGCTGATC CCCTCCCTAA TAATGCTGTA 1200 
ATACACAACT CTTGGATAGA TCCCAAAAGA AACAAAAAAA TTACCTTTGA AGATAGTGAA 1260 
ATAAGAGAAA AACGTTTAGT GCCTCAGGTr GTCACAGACT AG 1302 
(49) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Met Cys Phe Ser Pro Ile Leu Glu Ile Asn Met Gln Ser Glu Ser Asn
1 5 10 15

Ile Thr Val Arg Asp Asp Ile Asp Asp Ile Asn Thr Asn Met Tyr Gln
20 25 30

Pro Leu Ser Tyr Pro Leu Ser Phe Gln Val Ser Leu Thr Gly Phe Leu
35 40 45

Met Leu Glu Ile Val Leu Gly Leu Gly Ser Asn Leu Thr Val Leu Val
50 55 60

Leu Tyr Cys Met Lys Ser Asn Leu Ile Asn Ser Val Ser Asn Ile Ile
65 70 75 80

Thr Met Asn Leu His Val Leu Asp Val Ile Ile Cys Val Gly Cys Ile
85 90 95

Pro Leu Thr Ile Val Ile Leu Leu Leu Ser Leu Glu Ser Asn Thr Ala
100 105 110

Leu Ile Cys Cys Phe His Glu Ala Cys Val Ser Phe Ala Ser Val Ser
115 120 125

Thr Ala Ile Asn Val Phe Ala Ile Thr Leu Asp Arg Tyr Asp Ile Ser
130 135 140

Val Lys Pro Ala Asn Arg Ile Leu Thr Met Gly Arg Ala Val Met Leu
145 150 155 160

Met Ile Ser Ile Trp Ile Phe Ser Phe Phe Ser Phe Leu Ile Pro Phe
165 170 175

Ile Glu Val Asn Phe Phe Ser Leu Gln Ser Gly Asn Thr Trp Glu Asn
180 185 190

Lys Thr Leu Leu Cys Val Ser Thr Asn Glu Tyr Tyr Thr Glu Leu Gly
195 200 205

Met Tyr Tyr His Leu Leu Val Gln Ile Pro Ile Phe Phe Phe Thr Val
210 215 220

Val Val Met Leu Ile Thr Tyr Thr Lys Ile Leu Gln Ala Leu Asn Ile
225 230 235 240

Arg Ile Gly Thr Arg Phe Ser Thr Gly Gln Lys Lys Lys Ala Arg Lys
245 250 255

Lys Lys Thr Ile Ser Leu Thr Thr Gln His Glu Ala Thr Asp Met Ser
260 265 270

Gln Ser Ser Gly Gly Arg Asn Val Val Phe Gly Val Arg Thr Ser Val
275 280 285

Ser Val Ile Ile Ala Leu Arg Arg Ala Val Lys Arg His Arg Glu Arg
290 295 300

Arg Glu Arg Gln Lys Arg Val Phe Arg Met Ser Leu Leu Ile Ile Ser
305 310 315 320



Thr Phe Leu Leu Cys Trp Thr Pro Ile Ser Val Leu Asn Thr Thr Ile
325 330 335

Leu Cys Leu Gly Pro Ser Asp Leu Leu Val Lys Leu Arg Leu Cys Phe
340 345 350

Leu Val Met Ala Tyr Gly Thr «^ Ile Phe His Pro Leu Leu Tyr Ala
355 360 365

Phe Arg Gln Lys Phe Gln Lys Val Leu Lys Ser Lys Met Lys Lys
370 375 380

Arg Val Val Ser Ile Val Glu Ala Asp Pro Leu Pro Asn Asn Ala Val
385 390 395 400

Ile His Asn Ser Trp Ile Asp Pro Lys Arg Asn Lys Lys Ile ^ Phe
405 410 415

Glu Asp Ser Glu Ile Arg Glu Lys Arg Leu Val Pro Gln Val Val Thr
420 425 430

Asp



(50) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GTGAAGCTTG CCTCTGGTGC CTGCAGGAGG
(51) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GCAGAATTCC CGGTGGCGTG TTGTGGTGCC C 31
(52) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1209 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATGTTGTGTC CTTCCAAGAC AGATGGCTCA GGGCACTCTG GTAGGATTCA CCAGGAAACT 60 

CATGGAGAAG GGAAAAGGGA CAAGATTAGC AACAGTGAAG GGAGGGAGAA TGGTGGGAGA 120 

GGATTCCAGA TGAACGGTGG GTCGCTGGAG GCTGAGCATG CCAGCAGGAT GTCAGTTCTC 180 

AGAGCAAAGC CCATGTCAAA CAGCCAACGC TTGCTCCTTC TGTCCCCAGG ATCACCTCCT 240 

CGCACGGGGA GCATCTCCTA CATCAACATC ATCATGCCTT CGGTGTTCGG CACCATCTGC 300 

CTCCTGGGCA TCATCGGGAA CTCCACGGTC ATCTTCGCGG TCGTGAAGAA GTCCAAGCTG 360 

CACTGGTGCA ACAACGTCCC CGACATCTTC ATCATCAACC TCTCGGTAGT AGATCTCCTC 420 

TTTCTCCTGG GCATGCCCTT CATGATCCAC CAGCTCATGG GCAATGGGGT GTGGCACTTT 480 

GGGGAGACCA TGTGCACCCT CATCACGGCC ATGGATGCCA ATAGTCAGTT CACCAGCACC 540

TACATCCTGA CCGCCATGGC CATTGACCGC TACCTGGCCA CTGTCCACCC CATCTCTTCC 600

ACGAAGTTCC GGAAGCCCTC TGTGGCCACC CTGGTGATCT GCCTCCTGTG GGCCCTCTCC 660 

TTCATCAGCA TCACCCCTGT GTGGCTGTAT GCCAGACTCA TCCCCTTCCC AGGAGGTGCA 720 

GTGGGCTGCG GCATACGCCT GCCCAACCCA GACACTGACC TCTACTGGTT CACCCTGTAC 780 

CAGTTTTTCC TGGCCTTTGC CCTGCCTTTT GTGGTCATCA CAGCCGCATA CGTGAGGATC 840 

CTGCAGCGCA TGACGTCCTC AGTGGCCCCC GCCTCCCAGC GCAGCATCCG GCTGCGGACA 900 

AAGAGGGTGA CCCGCACAGC CATCGCCATC TGTCTGGTCT TCTTTGTGTG CTGGGCACCC 960 

TACTATGTGC TACAGCTGAC CCAGTTGTCC ATCAGCCGCC CGACCCTCAC CTTTGTCTAC 1020 

TTATACAATG CGGCCATCAG CTTGGGCTAT GCCAACAGCT GCCTCAACCC CTTTGTGTAC 1080 

ATCGTGCTCT GTGAGACGTT CCGCAAACGC TTGGTCCTGT CGGTGAAGCC TGCAGCCCAG 1140 

GGGCAGCTTC GCGCTGTCAG CAACGCTCAG ACGGCTGACG AGGAGAGGAC AGAAAGCAAA 1200 

GGCACCTGA 1209 
(53) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
Met Leu Cys Pro Ser Lys Thr Asp Gly Ser Gly His Ser Gly Arg Ile
1 5 10 15

His Gln Glu Thr His Gly Glu Gly Lys Arg Asp Lys Ile Ser Asn Ser
20 25 30

Glu Gly Arg Glu Asn Gly Gly Arg Gly Phe Gln Met Asn Gly Gly Ser
35 40 45

Leu Glu Ala Glu His Ala Ser Arg Met Ser Val Leu Arg Ala Lys Pro
50 55 60

Met Ser Asn Ser Gln Arg Leu Leu Leu Leu Ser Pro Gly Ser Pro Pro
65 70 75 80

Arg Thr Gly Ser Ile Ser Tyr Ile Asn Ile Ile Met Pro Ser Val Phe
85 90 95

Gly Thr Ile Cys Leu Leu Gly Ile Ile Gly Asn Ser Thr Val Ile Phe
100 105 110

Ala Val Val Lys Lys Ser Lys Leu His Trp Cys Asn Asn Val Pro Asp
115 120 125

Ile Phe Ile Ile Asn Leu Ser Val Val Asp Leu Leu Phe Leu Leu Gly
130 135 140

Met Pro Phe Met Ile His Gln Leu Met Gly Asn Gly Val Trp His Phe
145 150 155 160

Gly Glu Thr Met Cys Thr Leu Ile Thr Ala Met Asp Ala Asn Ser Gln
165 170 175

Phe Thr Ser Thr Tyr Ile Leu Thr Ala Met Ala Ile Asp Arg Tyr Leu
180 185 190

Ala Thr Val His Pro Ile Ser Ser Thr Lys Phe Arg Lys Pro Ser Val
195 200 205

Ala Thr Leu Val Ile Cys Leu Leu Trp Ala Leu Ser Phe Ile Ser Ile
210 215 220

Thr Pro Val Trp Leu Tyr Ala Arg Leu Ile Pro Phe Pro Gly Gly Ala
225 230 235 240

Val Gly Cys Gly Ile Arg Leu Pro Asn Pro Asp Thr Asp Leu Tyr Trp
245 250 255

Phe Thr Leu Tyr Gln Phe Phe Leu Ala Phe Ala Leu Pro Phe Val Val
260 265 270

Ile Thr Ala Ala Tyr Val Arg Ile Leu Gln Arg Met Thr Ser Ser Val
275 280 285

Ala Pro Ala Ser Gln Arg Ser Ile Arg Leu Arg Thr Lys Arg Val Thr
290 295 300

Arg Thr Ala Ile Ala Ile Cys Leu Val Phe Phe Val Cys Trp Ala Pro
305 310 315 320

Tyr Tyr Val Leu Gln Leu Thr Gln Leu Ser Ile Ser Arg Pro Thr Leu
325 330 335

Thr Phe Val Tyr Leu Tyr Asn Ala Ala Ile Ser Leu Gly Tyr Ala Asn
340 345 350

Ser Cys Leu Asn Pro Phe Val Tyr Ile Val Leu Cys Glu Thr Phe Arg
355 360 365

Lys Arg Leu Val Leu Ser Val Lys Pro Ala Ala Gln Gly Gln Leu Arg
370 375 380

Ala Val Ser Asn Ala Gln Thr Ala Asp Glu Glu Arg Thr Glu Ser Lys
385 390 395 400

Gly Thr

(54) INFORMATION FOR SEQ ID NO: 53: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
GGCGGATCCA TGGATGTGAC TTCCCAA 27 
(55) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
GGCGGATCCC TACACGGCAC TGCTGAA 27
(56) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAC 60

GCTGCGGCCC CCAACACCAC CTCCCCOGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120 

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240 

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360 

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420 

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480 

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660 

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720 

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GCTCCGCATG 780

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840 

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900 

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCGCCGCCT TCTCCAACAG CTGCCTAAAC 960

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080 

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTGA 1128 
(57) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 375 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala His Ala Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Leu Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Ala Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(58) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

AAGGAATTCA CGGCCGGGTG ATGCCATTCC C 31
(59) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
GGTGGATCCA TAAACACGGG CGTTGAGGAC 30
(60) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 960 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

ATGCCATTCC CAAACTGCTC AGCCCCCAGC ACTGTGGTGG CCACAGCTGT GGGTGTCTTG 60

CTGGGGCTGG AGTGTGGGCT GGGTCTGCTG GGCAACGCGG TGGCGCTGTG GACCTTCCTG 120 

TTCCGGGTCA GGGTGTGGAA GCCGTACGCT GTCTACCTGC TCAACCTGGC CCTGGCTGAC 180

CTGCTGTTGG CTGCGTGCCT GCCTTTCCTG GCCGCCTTCT ACCTGAGCCT CCAGGCTTGG 240 

CATCTGGGCC GTGTGGGCTG CTGGGCCCTG CGCTTCCTGC TGGACCTCAG CCGCAGCGTG 300 

GGGATGGCCT TCCTGGCCGC CGTGGCTTTG GACCGGTACC TCCGTGTGGT CCACCCTCGG 360

CTTAAGGTCA ACCTGCTGTC TCCTCAGGCG GCCCTGGGGG TCTCGGGCCT CGTCTGGCTC 420 

CTGATGGTCG CCCTCACCTG CCCGGGCTTG CTCATCTCTG AGGCCGCCCA GAACTCCACC 480

AGGTGCCACA GTTTCTACTC CAGGGCAGAC GGCTCCTTCA GCATCATCTG GCAGGAAGCA 540 

CTCTCCTGCC TTCAGTTTGT CCTCCCCTTT GGCCTCATCG TGTTCTGCAA TGCAGGCATC 600 

ATCAGGGCTC TCCAGAAAAG ACTCCGGGAG CCTGAGAAAC AGCCCAAGCT TCAGCGGGCC 660 

CAGGCACTGG TCACCTTGGT GGTGGTGCTG TTTGCTCTGT GCTTTCTGCC CTGCTTCCTG 720 

GCCAGAGTCC TGATGCACAT CTTCCAGAAT CTGGGGAGCT GCAGGGCCCT TTGTGCAGTG 780

GCTCATACCT CGGATGTCAC GGGCAGCCTC ACCTACCTGC ACAGTGTCGT CAACCCCGTG 840 

GTATACTGCT TCTCCAGCCC CACCTTCAGG AGCTCCTATC GGAGGGTCTT CCACACCCTC 900

CGAGGCAAAG GGCAGGCAGC AGAGCCCCCA GATTTCAACC CCAGAGACTC CTATTCCTGA 960 
(61) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
Met Pro Phe Pro Asn Cys Ser Ala Pro Ser Thr Val Val Ala Thr Ala
1 5 10 15

Val Gly Val Leu Leu Gly Leu Glu Cys Gly Leu Gly Leu Leu Gly Asn
20 25 30

Ala Val Ala Leu Trp Thr Phe Leu Phe Arg Val Arg Val Trp Lys Pro
35 40 45

Tyr Ala Val Tyr Leu Leu Asn Leu Ala Leu Ala Asp Leu Leu Leu Ala
50 55 60

Ala Cys Leu Pro Phe Leu Ala Ala Phe Tyr Leu Ser Leu Gln Ala Trp
65 70 75 80

His Leu Gly Arg Val Gly Cys Trp Ala Leu Arg Phe Leu Leu Asp Leu
85 90 95

Ser Arg Ser Val Gly Met Ala Phe Leu Ala Ala Val Ala Leu Asp Arg
100 105 110

Tyr Leu Arg Val Val His Pro Arg Leu Lys Val Asn Leu Leu Ser Pro
115 120 125

Gln Ala Ala Leu Gly Val Ser Gly Leu Val Trp Leu Leu Met Val Ala
130 135 140

Leu Thr Cys Pro Gly Leu Leu Ile Ser Glu Ala Ala Gln Asn Ser Thr
145 150 155 160

Arg Cys His Ser Phe Tyr Ser Arg Ala Asp Gly Ser Phe Ser Ile Ile
165 170 175

Trp Gln Glu Ala Leu Ser Cys Leu Gln Phe Val Leu Pro Phe Gly Leu
180 185 190

Ile Val Phe Cys Asn Ala Gly Ile Ile Arg Ala Leu Gln Lys Arg Leu
195 200 205

Arg Glu Pro Glu Lys Gln Pro Lys Leu Gln Arg Ala Gln Ala Leu Val
210 215 220

Thr Leu Val Val Val Leu Phe Ala Leu Cys Phe Leu Pro Cys Phe Leu
225 230 235 240

Ala Arg Val Leu Met His Ile Phe Gln Asn Leu Gly Ser Cys Arg Ala
245 250 255

Leu Cys Ala Val Ala His Thr Ser Asp Val Thr Gly Ser Leu Thr Tyr
260 265 270

Leu His Ser Val Val Asn Pro Val Val Tyr Cys Phe Ser Ser Pro Thr
275 280 285

Phe Arg Ser Ser Tyr Arg Arg Val Phe His Thr Leu Arg Gly Lys Gly
290 295 300

Gln Ala Ala Glu Pro Pro Asp Phe Asn Pro Arg Asp Ser Tyr Ser
305 310 315

(62) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1143 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

ATGGAGGAAG GTGGTGATTT TGACAACTAC TATGGGGCAG ACAACCAGTC TGAGTGTGAG 60 

TACACAGACT GGAAATCCTC GGGGGCCCTC ATCCCTGCCA TCTACATGTT GGTCTTCCTC 120 

CTGGGCACCA CGGGAAACGG TCTGGTGCTC TGGACCGTGT TTCGGAGCAG CCGGGAGAAG 180 

AGGCGCTCAG CTGATATCTT CATTGCTAGC CTGGCGGTGG CTGACCTGAC CTTCGTGGTG 240 

ACGCTGCCCC TGTGGGCTAC CTACACGTAC CGGGACTATG ACTGGCCCTT TGGGACCTTC 300 

TTCTGCAAGC TCAGCAGCTA CCTCATCTTC GTCAACATGT ACGCCAGCGT CTTCTGCCTC 360 

ACCGGCCTCA GCTTCGACCG CTACCTGGCC ATCGTGAGGC CAGTGGCCAA TGCTCGGCTG 420 

AGGCTGCGGG TCAGCGGGGC CGTGGCCACG GCAGTTCTTT GGGTGCTGGC CGCCCTCCTG 480 

GCCATGCCTG TCATGGTGTT ACGCACCACC GGGGACTTGG AGAACACCAC TAAGGTGCAG 540 

TGCTACATGG ACTACTCCAT GGTGGCCACT GTGAGCTCAG AGTGGGCCTG GGAGGTGGGC 600 

CTTGGGGTCT CGTCCACCAC CGTGGGCTTT GTGGTGCCCT TCACCATCAT GCTGACCTGT 660 

TACTTCTTCA TCGCCCAAAC CATCGCTGGC CACTTCCGCA AGGAACGCAT CGAGGGCCTG 720 

CGGAAGCGGC GCCGGCTGCT CAGCATCATC GTGGTGCTGG TGGTGACCTT TGCCCTGTGC 780 

TGGATGCCCT ACCACCTGGT GAAGACGCTG TACATGCTGG GCAGCCTGCT GCACTGGCCC 840

TGTGACTTTG ACCTCTTCCT CATGAACATC TTCCCCTACT GCACCTGCAT CAGCTACGTC 900 

AACAGCTGCC TCAACCCCTT CCTCTATGCC TTTTTCGACC CCCGCTTCCG CCAGGCCTGC 960 

ACCTCCATGC TCTGCTGTGG CCAGAGCAGG TGCGCAGGCA CCTCCCACAG CAGCAGTGGG 1020 

GAGAAGTCAG CCAGCTACTC TTCGGGGCAC AGCCAGGGGC CCGGCCCCAA CATGGGCAAG 1080 

GGTGGAGAAC AGATGCACGA GAAATGCATC CCCTACAGCC AGGAGACCCT TGTGGTTGAC 1140 

TAG 1143 



(63) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
Met Glu Glu Gly Gly Asp Phe Asp Asn Tyr Tyr Gly Ala Asp Asn Gln
1 5 10 15

Ser Glu Cys Glu Tyr Thr Asp Trp Lys Ser Ser Gly Ala Leu Ile Pro
20 25 30

Ala Ile Tyr Met Leu Val Phe Leu Leu Gly Thr Thr Gly Asn Gly Leu
35 40 45

Val Leu Trp Thr Val Phe Arg Ser Ser Arg Glu Lys Arg Arg Ser Ala
50 55 60

Asp Ile Phe Ile Ala Ser Leu Ala Val Ala Asp Leu Thr Phe Val Val
65 70 75 80

Thr Leu Pro Leu Trp Ala Thr Tyr Thr Tyr Arg Asp Tyr Asp Trp Pro
85 90 95

Phe Gly Thr Phe Phe Cys Lys Leu Ser Ser Tyr Leu Ile Phe Val Asn
100 105 110

Met Tyr Ala Ser Val Phe Cys Leu Thr Gly Leu Ser Phe Asp Arg Tyr
115 120 125

Leu Ala Ile Val Arg Pro Val Ala Asn Ala Arg Leu Arg Leu Arg Val
130 135 140

Ser Gly Ala Val Ala Thr Ala Val Leu Trp Val Leu Ala Ala Leu Leu
145 150 155 160

Ala Met Pro Val Met Val Leu Arg Thr Thr Gly Asp Leu Glu Asn Thr
165 170 175

Thr Lys Val Gln Cys Tyr Met Asp Tyr Ser Met Val Ala Thr Val Ser
180 185 190

Ser Glu Trp Ala Trp Glu Val Gly Leu Gly Val Ser Ser Thr Thr Val
195 200 205

Gly Phe Val Val Pro Phe Thr Ile Met Leu Thr Cys Tyr Phe Phe Ile
210 215 220

Ala Gln Thr Ile Ala Gly His Phe Arg Lys Glu Arg Ile Glu Gly Leu
225 230 235 240

Arg Lys Arg Arg Arg Leu Leu Ser Ile Ile Val Val Leu Val Val Thr
245 250 255

Phe Ala Leu Cys Trp Met Pro Tyr His Leu Val Lys Thr Leu Tyr Met
260 265 270

Leu Gly Ser Leu Leu His Trp Pro Cys Asp Phe Asp Leu Phe Leu Met
275 280 285

Asn Ile Phe Pro Tyr Cys Thr Cys Ile Ser Tyr Val Asn Ser Cys Leu
290 295 300

Asn Pro Phe Leu Tyr Ala Phe Phe Asp Pro Arg Phe Arg Gln Ala Cys
305 310 315 320

Thr Ser Met Leu Cys Cys Gly Gln Ser Arg Cys Ala Gly Thr Ser His
325 330 335

Ser Ser Ser Gly Glu Lys Ser Ala Ser Tyr Ser Ser Gly His Ser Gln
340 345 350

Gly Pro Gly Pro Asn Met Gly Lys Gly Gly Glu Gln Met His Glu Lys
355 360 365

Ser Ile Pro Tyr Ser Gln Glu Thr Leu Val Val Asp
370 375 380

(64) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

TGAGAATTCT GGTGACTCAC AGCCGGCACA G 31
(65) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
GCCGGATCCA AGGAAAAGCA GCAATAAAAG G 31
(66) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1119 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

ATGAACTACC CGCTAACGCT GGAAATGGAC CTCGAGAACC TGGAGGACCT GTTCTGGGAA 60
CTGGACAGAT TGGACAACTA TAACGACACC TCCCTGGTCG AAAATCATCT CTGCCCTCCC 120
ACAGAGGGTC CCCTCATGGC CTCCTTCAAG GCCGTGTTCG TGCCCGTGGC CTACAGCCTC 180
ATCTTCCTCC TGGGCGTGAT CGGCAACGTC CTGGTGCTGG TGATCCTGGA GCGGCACCGG 240
CAGACACGCA GTTCCACGGA GACCTTCCTG TTCCACCTGG CCGTGGCCGA CCTCCTGCTG 300
GTCTTCATCT TGCCCTTTGC CGTGGCCGAG GGCTCTGTGG GCTGGGTCCT GGGGACCTTC 360
CTCTGCAAAA CTGTGATTGC CCTGCACAAA GTCAACTTCT ACTGCAGCAG CCTGCTCCTG 420
GCCTGCATCG CCGTGGACCG CTACCTGGCC ATTGTCCACG CCGTCCATGC CTACCGCCAC 480
CGCCGCCTCC TCTCCATCCA CATCACCTGT GGGACCATCT GGCTGGTGGG CTTCCTCCTT 540
GCCTTGCCAG AGATTCTCTT CGCCAAAGTC AGCCAAGGCC ATCACAACAA CTCCCTGCCA 600
CGTTGCACCT TCTCCCAAGA GAACCAAGCA GAAACGCATG CCTGGTTCAC CTCCCGATTC 660
CTCTACCATG TGGCGGGATT CCTGCTGCCC ATGCTGGTGA TGGGCTGGTG CTACGTGGGG 720
GTAGTGCACA GGTTGCGCCA GGCCCAGCGG CGCCCTCAGC GGCAGAAGGC AGTCAGGGTG 780
GCCATCCTCG TGACAAGCAT CTTCTTCCTC TGCTGGTCAC CCTACCACAT CGTCATCTTC 840
CTGGACACCC TGGCGAGGCT GAAGGCCGTG GACAATACCT GCAAGCTGAA TGGCTCTCTC 900
CCCGTGGCCA TCACCATGTG TGAGTTCCTG GGCCTGGCCC ACTGCTGCCT CAACCCCATC 960
CTCTACACTT TCGCCGGCGT GAAGTTCCGC AGTGACCTGT CGCGGCTCCT GACCAAGCTG 1020
GGCTGTACCG GCCCTGCCTC CCTGTGCCAG CTCTTCCCTA GCTGGCGCAG GAGCAGTCTC 1080
TCTGAGTCAG AGAATGCCAC CTCTCTCACC ACGTTCTAG 1119

(67) INFORMATION FOR SEQ ID NO: 66:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: 

Met Asn Tyr Pro Leu Thr Leu Glu Met Asp Leu Glu Asn Leu Glu Asp
1 5 10 15

Leu Phe Trp Glu Leu Asp Arg Leu Asp Asn Tyr Asn Asp Thr Ser Leu
20 25 30

Val Glu Asn His Leu Cys Pro Ala Thr Glu Gly Pro Leu Met Ala Ser
35 40 45

Phe Lys Ala Val Phe Val Pro Val Ala Tyr Ser Leu Ile Phe Leu Leu
50 55 60

Gly Val Ile Gly Asn Val Leu Val Leu Val Ile Leu Glu Arg His Arg
65 70 75 80

Gln Thr Arg Ser Ser Thr Glu Thr Phe Leu Phe His Leu Ala Val Ala
85 90 95

Asp Leu Leu Leu Val Phe Ile Leu Pro Phe Ala Val Ala Glu Gly Ser
100 105 110

Val Gly Trp Val Leu Gly Thr Phe Leu Cys Lys Thr Val Ile Ala Leu
115 120 125

His Lys Val Asn Phe Tyr Cys Ser Ser Leu Leu Leu Ala Cys Ile Ala
130 135 140

Val Asp Arg Tyr Leu Ala Ile Val His Ala Val His Ala Tyr Arg His
145 150 155 160

Arg Arg Leu Leu Ser Ile His Ile Thr Cys Gly Thr Ile Trp Leu Val
165 170 175

Gly Phe Leu Leu Ala Leu Pro Glu Ile Leu Phe Ala Lys Val Ser Gln
180 185 190

Gly His His Asn Asn Ser Leu Pro Arg Cys Thr Phe Ser Gln Glu Asn
195 200 205

Gln Ala Glu Thr His Ala Trp Phe Thr Ser Arg Phe Leu Tyr His Val
210 215 220

Ala Gly Phe Leu Leu Pro Met Leu Val Met Gly Trp Cys Tyr Val Gly
225 230 235 240

Val Val His Arg Leu Arg Gln Ala Gln Arg Arg Pro Gln Arg Gln Lys
245 250 255

Ala Val Arg Val Ala Ile Leu Val Thr Ser Ile Phe Phe Leu Cys Trp
260 265 270

Ser Pro Tyr His Ile Val Ile Phe Leu Asp Thr Leu Ala Arg Leu Lys
275 280 285

Ala Val Asp Asn Thr Cys Lys Leu Asn Gly Ser Leu Pro Val Ala Ile
290 295 300

Thr Met Cys Glu Phe Leu Gly Leu Ala His Cys Cys Leu Asn Pro Met
305 310 315 320

Leu Tyr Thr Phe Ala Gly Val Lys Phe Arg Ser Asp Leu Ser Arg Leu
325 330 335

Leu Thr Lys Leu Gly Cys Thr Gly Pro Ala Ser Leu Cys Gln Leu Phe
340 345 350

Pro Ser Trp Arg Arg Ser Ser Leu Ser Glu Ser Glu Asn Ala Thr Ser
355 360 365

Leu Thr Thr Phe
370



(68) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
CAAAGCTTGA AAGCTGCACG GTGCAGAGAC 

(69) INFORMATION FOR SEQ ID NO:68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
GCGGATCCCG AGTCACACCC TGGCTGGGCC 

(70) INFORMATION FOR SEQ ID NO: 69:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAG 60 

CCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120 

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180 

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240 

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300 

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360 

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420 

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480 

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540 

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600 

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660 

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GCTCCGCATG 780 

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840 

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900 

GCCCACCGCC TCACGGGCCA CATTGTCAAC CTCACCGCCT TCTCCAACAG CTGCCTAAAC 960 

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020 

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTAG 1128 

(71) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 375 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala Gln Pro Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Leu Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Thr Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(72) INFORMATION FOR SEQ ID NO: 71: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
ACAGAATTCC TGTGTGGTTT TACCGCCCAG 

(73) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

CTCGGATCCA GGCAGAAGAG TCGCCTATCG 

(74) INFORMATION FOR SEQ ID NO: 73: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1137 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

ATGGACCTGG GGAAACCAAT GAAAAGCGTG CTGGTGGTGG CTCTCCTTGT CATTTTCCAG 60 

GTATGCCTGT GTCAAGATGA GGTCACGGAC GATTACATCG GAGACAACAC CACAGTGGAC 120 

TACACTTTGT TCGAGTCTTT GTGCTCCAAG AAGGACGTGC GGAACTTTAA AGCCTGGTTC 180 

CTCCCTATCA TGTACTCCAT CATTTGTTTC GTGGGCCTAC TGGGCAATGG GCTGGTCGTG 240 

TTGACCTATA TCTATTTCAA GAGGCTCAAG ACCATGACCG ATACCTACCT GCTCAACCTG 300 

GCGGTGGCAG ACATCCTCTT CCTCCTGACC CTTCCCTTCT GGGCCTACAG CGCGGCCAAG 360 

TCCTGGGTCT TCGGTGTCCA CTTTTGCAAG CTCATCTTTG CCATCTACAA GATGAGCTTC 420 

TTCAGTGGCA TGCTCCTACT TCTTTGCATC AGCATTGACC GCTACGTGGC CATCGTCCAG 480 

GCTGTCTCAG CTCACCGCCA CCGTGCCCGC GTCCTTCTCA TCAGCAAGCT GTCCTGTGTG 540 

GGCATCTGGA TACTAGCCAC AGTGCTCTCC ATCCCAGAGC TCCTGTACAG TGACCTCCAG 600 

AGGAGCAGCA GTGAGCAAGC GATGCGATGC TCTCTCATCA CAGAGCATGT GGAGGCCTTT 660 

ATCACCATCC AGGTGGCCCA GATGGTGATC GGCTTTCTGG TCCCCCTGCT GGCCATGAGC 720 

TTCTGTTACC TTGTCATCAT CCGCACCCTG CTCCAGGCAC GCAACTTTGA GCGCAACAAG 780 

GCCATCAAGG TGATCATCGC TGTGGTCGTG GTCTTCATAG TCTTCCAGCT GCCCTACAAT 840 

GGGGTGGTCC TGGCCCAGAC GGTGGCCAAC TTCAACATCA CCAGTAGCAC CTGTGAGCTC 900 

AGTAAGCAAC TCAACATCGC CTACGACGTC ACCTACAGCC TGGCCTGCGT CCGCTGCTGC 960 

GTCAACCCTT TCTTGTACGC CTTCATCGGC GTCAAGTTCC GCAACGATCT CTTCAAGCTC 1020

TTCAAGGACC TGGGCTGCCT CAGCCAGGAG CAGCTCCGGC AGTGGTCTTC CTGTCGGCAC 1080 

ATCCGGCGCT CCTCCATGAG TGTGGAGGCC GAGACCACCA CCACCTTCTC CCCATAG 1137 
(75) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 378 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu
1 5 10 15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr
20 25 30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys
35 40 45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met
50 55 60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65 70 75 80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr
85 90 95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro
100 105 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe
115 120 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met
130 135 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145 150 155 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys
165 170 175

Leu Ser Cys Val Gly Ile Trp Ile Leu Ala Thr Val Leu Ser Ile Pro
180 185 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
195 200 205

Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln
210 215 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225 230 235 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe
245 250 255

Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val Phe
260 265 270


Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val
275 280 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu
290 295 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305 310 315 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp
325 330 335

Leu Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu
340 345 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val
355 360 365

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
370 375

(76) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
CTGGAATTCA CCTGGACCAC CACCAATGGA TA 
(77) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CTCGGATCCT GCAAAGTTTG TCATACAGTT 
(78) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1085 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

ATGGATATAC AAATGGCAAA CAATTTTACT CCGCCCTCTG CAACTCCTCA GGGAAATGAC 60 

TGTGACCTCT ATGCACATCA CAGCACGGCC AGGATAGTAA TGCCTCTGCA TTACAGCCTC 120 

GTCTTCATCA TTGGGCTCGT GGGAAACTTA CTAGCCTTGG TCGTCATTGT TCAAAACAGG 180 

AAAAAAATCA ACTCTACCAC CCTCTATTCA ACAAATTTGG TGATTTCTGA TATACTTTTT 240 

ACCACGGCTT TGCCTACACG AATAGCCTAC TATGCAATGG GCTTTGACTG GAGAATCGGA 300 

GATGCCTTGT GTAGGATAAC TGCGCTAGTG TTTTACATCA ACACATATGC AGGTGTGAAC 360 

TTTATGACCT GCCTGAGTAT TGACCGCTTC ATTGCTGTGG TGCACCCTCT ACGCTACAAC 420 

AAGATAAAAA GGATTGAACA TGCAAAAGGC GTGTGCATAT TTGTCTGGAT TCTAGTATTT 480 

GCTCAGACAC TCCCACTCCT CATCAACCCT ATGTCAAAGC AGGAGGCTGA AAGGATTACA 540

TGCATGGAGT ATCCAAACTT TGAAGAAACT AAATCTCTTC CCTGGATTCT GCTTGGGGCA 600 

TGTTTCATAG GATATGTACT TCCACTTATA ATCATTCTCA TCTGCTATTC TCAGATCTGC 660 

TGCAAACTCT TCAGAACTGC CAAACAAAAC CCACTCACTG AGAAATCTGG TGTAAACAAA 720 

AAGGCTCTCA ACACAATTAT TCTTATTATT GTTGTGTTTG TTCTCTGTTT CACACCTTAC 780 

CATGTTGCAA TTATTCAACA TATGATTAAG AAGCTTCGTT TCTCTAATTT CCTGGAATGT 840 

AGCCAAAGAC ATTCGTTCCA GATTTCTCTG CACTTTACAG TATGCCTGAT GAACTTCAAT 900

TGCTGCATGG ACCCTTTTAT CTACTTCTTT GCATGTAAAG GGTATAAGAG AAAGGTTATG 960 

AGGATGCTGA AACGGCAAGT CAGTGTATCG ATTTCTAGTG CTGTGAAGTC AGCCCCTGAA 1020

GAAAATTCAC GTGAAATGAC AGAAACGCAG ATGATGATAC ATTCCAAGTC TTCAAATGGA 1080 

^GTGA ^Ogg 
(79) INFORMATION FOR SEQ ID NO:78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Met Asp Ile Gln Met Ala Asn Asn Phe Thr Pro Pro Ser Ala Thr Pro
1 5 10 15

Gln Gly Asn Asp Cys Asp Leu Tyr Ala His His Ser Thr Ala Arg Ile
20 25 30 

Val Met Pro Leu His Tyr Ser Leu Val Phe Ile Ile Gly Leu Val Gly
35 40 45

Asn Leu Leu Ala Leu Val Val Ile Val Gln Asn Arg Lys Lys Ile Asn
50 55 60

Ser Thr Thr Leu Tyr Ser Thr Asn Leu Val Ile Ser Asp Ile Leu Phe
65 70 75 80

Thr Thr Ala Leu Pro Thr Arg Ile Ala Tyr Tyr Ala Met Gly Phe Asp
85 90 95

Trp Arg Ile Gly Asp Ala Leu Cys Arg Ile Thr Ala Leu Val Phe Tyr
100 105 110

Ile Asn Thr Tyr Ala Gly Val Asn Phe Met Thr Cys Leu Ser Ile Asp
115 120 125

Arg Phe Ile Ala Val Val His Pro Leu Arg Tyr Asn Lys Ile Lys Arg
130 135 140

Ile Glu His Ala Lys Gly Val Cys Ile Phe Val Trp Ile Leu Val Phe
145 150 155 160

Ala Gln Thr Leu Pro Leu Leu Ile Asn Pro Met Ser Lys Gln Glu Ala
165 170 175

Glu Arg Ile Thr Cys Met Glu Tyr Pro Asn Phe Glu Glu Thr Lys Ser
180 185 190

Leu Pro Trp Ile Leu Leu Gly Ala Cys Phe Ile Gly Tyr Val Leu Pro
195 200 205

Leu Ile Ile Ile Leu Ile Cys Tyr Ser Gln Ile Cys Cys Lys Leu Phe
210 215 220

Arg Thr Ala Lys Gln Asn Pro Leu Thr Glu Lys Ser Gly Val Asn Lys
225 230 235 240

Lys Ala Leu Asn Thr Ile Ile Leu Ile Ile Val Val Phe Val Leu Cys
245 250 255

Phe Thr Pro Tyr His Val Ala Ile Ile Gln His Met Ile Lys Lys Leu
260 265 270

Arg Phe Ser Asn Phe Leu Glu Cys Ser Gln Arg His Ser Phe Gln Ile
275 280 285

Ser Leu His Phe Thr Val Cys Leu Met Asn Phe Asn Cys Cys Met Asp
290 295 300

Pro Phe Ile Tyr Phe Phe Ala Cys Lys Gly Tyr Lys Arg Lys Val Met
305 310 315 320

Arg Met Leu Lys Arg Gln Val Ser Val Ser Ile Ser Ser Ala Val Lys
325 330 335

Ser Ala Pro Glu Glu Asn Ser Arg Glu Met Thr Glu Thr Gln Met Met
340 345 350

Ile His Ser Lys Ser Ser Asn Gly Lys
355 360

(80) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

CTGGAATTCT CCTGCTCATC CAGCCATGCG G 31 

(81) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
CCTGGATCCC CACCCCTACT GGGGCCTCAG 30

(82) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
ATGCGGTGGC TGTGGCCCCT GGCTGTCTCT CTTGCTGTGA TTTTGGCTGT GGGGCTAAGC 60

AGGGTCTCTG GGGGTGCCCC CCTGCACCTG GGCAGGCACA GAGCCGAGAC CCAGGAGCAG 120

CAGAGCCGAT CCAAGAGGGG CACCGAGGAT GAGGAGGCCA AGGGCGTGCA GCAGTATGTG 180

CCTGAGGAGT GGGCGGAGTA CCCCCGGCCC ATTCACCCTG CTGGCCTGCA GCCAACCAAG 240

CCCTTGGTGG CCACCAGCCC TAACCCCGAC AAGGATGGGG GCACCCCAGA CAGTGGGCAG 300

GAACTGAGGG GCAATCTGAC AGGGGCACCA GGGCAGAGGC TACAGATCCA GAACCCCCTG 360

TATCCGGTGA CCGAGAGCTC CTACAGTGCC TATGCCATCA TGCTTCTGGC GCTGGTGGTG 420

TTTGCGGTGG GCATTGTGGG CAACCTGTCG GTCATGTGCA TCGTGTGGCA CAGCTACTAC 480

CTGAAGAGCG CCTGGAACTC CATCCTTGCC AGCCTGGCCC TCTGGGATTT TCTGGTCCTC 540

TTTTTCTGCC TCCCTATTGT CATCTTCAAC GAGATCACCA AGCAGAGGCT ACTGGGTGAC 600

GTTTCTTGTC GTGCCGTGCC CTTCATGGAG GTCTCCTCTC TGGGAGTCAC GACTTTCAGC 660

CTCTGTGCCC TGGGCATTGA CCGCTTCCAC GTGGCCACCA GCACCCTGCC CAAGGTGAGG 720

CCCATCGAGC GGTGCCAATC CATCCTGGCC AAGTTGGCTG TCATCTGGGT GGGCTCCATG 780

ACGCTGGCTG TGCCTGAGCT CCTGCTGTGG CAGCTGGCAC AGGAGCCTGC CCCCACCATG 840

GGCACCCTGG ACTCATGCAT CATGAAACCC TCAGCCAGCC TGCCCGAGTC CCTGTATTCA 900

CTGGTGATGA CCTACCAGAA CGCCCGCATG TGGTGGTACT TTGGCTGCTA CTTCTGCCTG 960

CCCATCCTCT TCACAGTCAC CTGCCAGCTG GTGACATGGC GGGTGCGAGG CCCTCCAGGG 1020

AGGAAGTCAG AGTGCAGGGC CAGCAAGCAC GAGCAGTGTG AGAGCCAGCT CAACAGCACC 1080

GTGGTGGGCC TGACCGTGGT CTACGCCTTC TGCACCCTCC CAGAGAACGT CTGCAACATC 1140

GTGGTGGCCT ACCTCTCCAC CGAGCTGACC CGCCAGACCC TGGACCTCCT GGGCCTCATC 1200

AACCAGTTCT CCACCTTCTT CAAGGGCGCC ATCACCCCAG TGCTGCTGCT TTGCATCTGC 1260

AGGCCGCTGG GCCAGGCCTT CCTGGACTGC TGCTGCTGCT GCTGCTGTGA GGAGTGCGGC 1320

GGGGCTTCGG AGGCCTCTGC TGCCAATGGG TCGGACAACA AGCTCAAGAC CGAGGTGTCC 1380

TCTTCCATCT ACTTCCACAA GCCCAGGGAG TCACCCCCAC TCCTGCCCCT GGGCACACCT 1440

TGCTGA 1446



(83) INFORMATION FOR SEQ ID NO: 82: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Met Arg Trp Leu Trp Pro Leu Ala Val Ser Leu Ala Val Ile Leu Ala
1 5 10 15

Val Gly Leu Ser Arg Val Ser Gly Gly Ala Pro Leu His Leu Gly Arg
20 25 30

His Arg Ala Glu Thr Gln Glu Gln Gln Ser Arg Ser Lys Arg Gly Thr
35 40 45

Glu Asp Glu Glu Ala Lys Gly Val Gln Gln Tyr Val Pro Glu Glu Trp
50 55 60

Ala Glu Tyr Pro Arg Pro Ile His Pro Ala Gly Leu Gln Pro Thr Lys
65 70 75 80

Pro Leu Val Ala Thr Ser Pro Asn Pro Asp Lys Asp Gly Gly Thr Pro
85 90 95

Asp Ser Gly Gln Glu Leu Arg Gly Asn Leu Thr Gly Ala Pro Gly Gln
100 105 110

Arg Leu Gln Ile Gln Asn Pro Leu Tyr Pro Val Thr Glu Ser Ser Tyr
115 120 125

Ser Ala Tyr Ala Ile Met Leu Leu Ala Leu Val Val Phe Ala Val Gly
130 135 140

Ile Val Gly Asn Leu Ser Val Met Cys Ile Val Trp His Ser Tyr Tyr
145 150 155 160

Leu Lys Ser Ala Trp Asn Ser Ile Leu Ala Ser Leu Ala Leu Trp Asp
165 170 175

Phe Leu Val Leu Phe Phe Cys Leu Pro Ile Val Ile Phe Asn Glu Ile
180 185 190

Thr Lys Gln Arg Leu Leu Gly Asp Val Ser Cys Arg Ala Val Pro Phe
195 200 205

Met Glu Val Ser Ser Leu Gly Val Thr Thr Phe Ser Leu Cys Ala Leu
210 215 220

Gly Ile Asp Arg Phe His Val Ala Thr Ser Thr Leu Pro Lys Val Arg
225 230 235 240

Pro Ile Glu Arg Cys Gln Ser Ile Leu Ala Lys Leu Ala Val Ile Trp
245 250 255

Val Gly Ser Met Thr Leu Ala Val Pro Glu Leu Leu Leu Trp Gln Leu
260 265 270

Ala Gln Glu Pro Ala Pro Thr Met Gly Thr Leu Asp Ser Cys Ile Met
275 280 285

Lys Pro Ser Ala Ser Leu Pro Glu Ser Leu Tyr Ser Leu Val Met Thr
290 295 300

Tyr Gln Asn Ala Arg Met Trp Trp Tyr Phe Gly Cys Tyr Phe Cys Leu
305 310 315 320

Pro Ile Leu Phe Thr Val Thr Cys Gln Leu Val Thr Trp Arg Val Arg
325 330 335

Gly Pro Pro Gly Arg Lys Ser Glu Cys Arg Ala Ser Lys His Glu Gln
340 345 350

Cys Glu Ser Gln Leu Asn Ser Thr Val Val Gly Leu Thr Val Val Tyr
355 360 365

Ala Phe Cys Thr Leu Pro Glu Asn Val Cys Asn Ile Val Val Ala Tyr
370 375 380

Leu Ser Thr Glu Leu Thr Arg Gln Thr Leu Asp Leu Leu Gly Leu Ile
385 390 395 400

Asn Gln Phe Ser Thr Phe Phe Lys Gly Ala Ile Thr Pro Val Leu Leu
405 410 415

Leu Cys Ile Cys Arg Pro Leu Gly Gln Ala Phe Leu Asp Cys Cys Cys
420 425 430

Cys Cys Cys Cys Glu Glu Cys Gly Gly Ala Ser Glu Ala Ser Ala Ala
435 440 445

Asn Gly Ser Asp Asn Lys Leu Lys Thr Glu Val Ser Ser Ser Ile Tyr
450 455 460

Phe His Lys Pro Arg Glu Ser Pro Pro Leu Leu Pro Leu Gly Thr Pro
465 470 475 480

Cys



(84) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
ATGTGGAACG CGACGCCCAG CG 22

(85) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
TCATGTATTA ATACTAGATT CT 22 

(86) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
TACCATGTGG AACGCGACGC CCAGCGAAGA GCCGGGGT 38 

(87) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
cggaattCat GTATTAATAC TAGATTCTGT CCAGGCCCG 39 

(88) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
ATGTGGAACG CGACGCCCAG CGAAGAGCCG GGGTTCAACC TCACACTGGC CGACCTGGAC 60
TCGGATGCTT CCCCCGGCAA CGACTCGCTG GGCGACGAGC TCCTGCAGCT CTTCCCCGCG 120
CCGCTGd^G CGGGCGTCAC AGCCACCTGC GTGGCACTCT TCGTGGOtSGG TATCGCTGGC 180
AACCTGCTCA CCATGCTGGT GGTGTCGCGC TTCCGCGAGC TCCGCACCAC CACCAACCTC 240
TACCTGTCCA GCATGGCCTT CTCCGATCTG CTCATCTTCC TCTGCATGCC CCTGGACCTC 300
GTTCGCCTCT GGCAGTACCG GCCCTGGAAC TTCGGOGACC TCCTCTGCAA ACTCTTCCAA 360
TTCGTCAGTG AGAGCTGCAC CTACGCCACG GTGCTCACCA TCACAGCGCT GAGCGTCGAG 420
CGCTACTTCG CCATCTGCTT CCCACTCCGG GCCAAGGTGG TGGTCACCAA GGGGCGGGTC 480
AAGCTGGTCA TCTTCGTCAT CTGGGCCGTG GCCTTCTGCA GCGCCGGGCC CATCTTCGTG 540
CTAGTCGGGG TGGAGCACGA GAACGGCACC GACCCTTGGG ACACCAACGA GTGCCGCCCC 600
ACCGAGTTTC CGGTGCGCTC TGGACTGCTC ACGGTCATGG TCTGGGTGTC CAGCATCTTC 660
TTCTTCCTTC CTGTCTTCTG TCTCACGGTC CTCTACAGTC TCATCGGCAG GAAGCl^TGG 720
CGGAGGAGGC GCGGCGATGC TGTCGTOGGT GCCTCGCTCA GGGACCAGAA CCACAAGCAA 780
ACCGTGAAAA TGCTGGCTGT AGTGGTGTTT GCCTTCATCC TCTGCTGGCT CCCCTTCCAC 840
GTAGGGCGAT ATTTATTTTC CAAATCCTTT GAGCCTGGCT CCTTGGAGAT TGCTCAGATC 900
AGCCAGTACT GCAACCTCGT GTCCTTTGTC CTCTTCTACC TCAGTGCTCC CATCAACCCC 960
ATTCTGTACA ACATCATGTC CAAGAAGTAC CGGGTGGCAG TGTTCAGACT TCTGGGATTC 1020
GAACCCTTCT CCCAGAGAAA GCTCTCCACT CTGAAAGATG A;^GTTCTCG GGCCTGGACA 1080
GAATCTAGTA TTAATACATG A 1101

(89) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: 

Met Trp Asn Ala Thr Pro Ser Glu Glu Pro Gly Phe Asn Leu Thr Leu 
1 5 10 15

Ala Asp Leu Asp Trp Asp Ala Ser Pro Gly Asn Asp Ser Leu Gly Asp
20 25 30

Glu Leu Leu Gln Leu Phe Pro Ala Pro Leu Leu Ala Gly Val Thr Ala
35 40 45 

Thr Cys Val Ala Leu Phe Val Val Gly Ile Ala Gly Asn Leu Leu Thr
50 55 60

Met Leu Val Val Ser Arg Phe Arg Glu Leu Arg Thr Thr Thr Asn Leu
65 70 75 80

Tyr Leu Ser Ser Met Ala Phe Ser Asp Leu Leu Ile Phe Leu Cys Met
85 90 95

Pro Leu Asp Leu Val Arg Leu Trp Gln Tyr Arg Pro Trp Asn Phe Gly
100 105 110

Asp Leu Leu Cys Lys Leu Phe Gln Phe Val Ser Glu Ser Cys Thr Tyr
115 120 125

Ala Thr Val Leu Thr Ile Thr Ala Leu Ser Val Glu Arg Tyr Phe Ala
130 135 140

Ile Cys Phe Pro Leu Arg Ala Lys Val Val Val Thr Lys Gly Arg Val
145 150 155 160

Lys Leu Val Ile Phe Val Ile Trp Ala Val Ala Phe Cys Ser Ala Gly
165 170 175

Pro Ile Phe Val Leu Val Gly Val Glu His Glu Asn Gly Thr Asp Pro
180 185 190

Trp Asp Thr Asn Glu Cys Arg Pro Thr Glu Phe Ala Val Arg Ser Gly
195 200 205

Leu Leu Thr Val Met Val Trp Val Ser Ser Ile Phe Phe Phe Leu Pro
210 215 220

Val Phe Cys Leu Thr Val Leu Tyr Ser Leu Ile Gly Arg Lys Leu Trp
225 230 235 240

Arg Arg Arg Arg Gly Asp Ala Val Val Gly Ala Ser Leu Arg Asp Gln
245 250 255

Asn His Lys Gln Thr Val Lys Met Leu Ala Val Val Val Phe Ala Phe
260 265 270

Ile Leu Cys Trp Leu Pro Phe His Val Gly Arg Tyr Leu Phe Ser Lys
275 280 285

Ser Phe Glu Pro Gly Ser Leu Glu Ile Ala Gln Ile Ser Gln Tyr Cys
290 295 300

Asn Leu Val Ser Phe Val Leu Phe Tyr Leu Ser Ala Ala Ile Asn Pro
305 310 315 320

Ile Leu Tyr Asn Ile Met Ser Lys Lys Tyr Arg Val Ala Val Phe Arg
325 330 335

Leu Leu Gly Phe Glu Pro Phe Ser Gln Arg Lys Leu Ser Thr Leu Lys
340 345 350

Asp Glu Ser Ser Arg Ala Trp Thr Glu Ser Ser Ile Asn Thr
355 360 365

(90) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
GCAAGCTTGT GCCCTCACCA AGCCATGCGA GCC 33
(91) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

CGGAATTCAG CAATGAGTTC CGACAGAAGC 30
(92) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATGCGAGCCC CGGGCGCGCT TCTCGCCCGC ATGTCGCGGC TACTGCTTCT GCTACTGCTC 60 

AAGGTGTCTG CCTCTTCTGC CCTCGGGGTC GCCCCTGCGT CCAGAAACGA AACTTGTCTG 120 

GGGGAGAGCT GTGCACCTAC AGTGATCCAG CGCCGCGGCA GGGACGCCTG GGGACCGGGA 180 

AATTCTGCAA GAGACGTTCT GCGAGCCCGA GCACCCAGGG AGGAGCAGGG GGCAGCGTTT 240

CTTGCGGGAC CCTCCTGGGA CCTGCCGGCG GCCCCGGGCC GTGACCCGGC TGCAGGCAGA 300

GGGGCGGAGG CGTCGGCAGC CGGACCCCCG GGACCTCCAA CCAGGCCACC TGGCCCCTGG 360 

AGGTGGAAAG GTGCTCGGGG TCAGGAGCCT TCTGAAACTT TGGGGAGAGG GAACCCCACG 420 

GCCCTCCAGC TCTTCCTTCA GATCTCAGAG GAGGAAGAGA AGGGTCCCAG AGGCGCTGGC 480 

ATTTCCGGGC GTAGCCAGGA GCAGAGTGTG AAGACAGTCC CCGGAGCCAG CGATCTTTTT 540

TACTGGCCAA GGAGAGCCGG GAAACTCCAG GGTTCCCACC ACAAGCCCCT GTCCAAGACG 600 

GCCAATGGAC TGGCGGGGCA CGAAGGGTGG ACAATTGCAC TCCCGGGCCG GGCGCTGGCC 660 

CAGAATGGAT CCTTGGGTGA AGGAATCCAT GAGCCTGGGG GTCCCCGCCG GGGAAACAGC 720 

ACGAACCGGC GTGTGAGACT GAAGAACCCC TTCTACCCGC TGACCCAGGA GTCCTATGGA 780

GCCTACGCGG TCATGTGTCT GTCCGTGGTG ATCTTCGGGA CCGGCATCAT TGGCAACCTG 840

GCGGTGATGA GCATCGTGTG CCACAACTAC TACATGCGGA GCATCTCCAA CTCCCTCTTG 900 

GCCAACCTGG CCTTCTGGGA CTTTCTCATC ATCTTCTTCT GCCTTCCGCT GGTCATCTTC 960 

CACGAGGTGA CCAAGAAGTG GCTGCTGGAG GACTTCTCCT GCAAGATCGT GCCCTATATA 1020 

GAGGTCGCTT CTCTGGGAGT CACCACTTTC ACCTTATGTG CTCTGTGCAT AGACCGCTTC 1080 

CGTGCTGCCA CCAACGTACA GATGTACTAC GAAATGATCG AAAACTGTTC CTCAACAACT 1140

GCCAAACTTG CTGTTATATG GGTGGGAGCT CTATTGTTAG CACTTCCAGA AGTTGTTCTC 1200

CGCCAGCTGA GCAAGGAGGA TTTGGGGTTT AGTGGCCGAG CTCCGGCAGA AAGGTGCATT 1260 

ATTAAGATCT CTCCTGATTT ACCAGACACC ATCTATGTTC TAGCCCTCAC CTACGACAGT 1320 

GCGAGACTGT GGTGGTATTT TGGCTGTTAC TTTTGTTTGC CCACGCTTTT CACCATCACC 1380 

TGCTCTCTAG TGACTGCGAG GAAAATCCGC AAAGCAGAGA AAGCCTGTAC CCGAGGGAAT 1440

AAACGGCAGA TTCAACTAGA GAGTCAGATG AACTGTACAG TAGTGGCACT GACCATTTTA 1500 

TATGGATTTT GCATTATTCC TGAAAATATC TGCAACATTG TTACTGCCTA CATGGCTACA 1560 

GGGGTTTCAC AGCAGACAAT GGACCTCCTT AATATCATCA GCCAGTTCCT TTTGTTCTTT 1620 

AAGTCCTGTG TCACCCCAGT CCTCCTTTTC TGTCTCTGCA AACCCTTCAG TCGGGCCTTC 1680 

ATGGAGTGCT GCTGCTGTTG CTGTGAGGAA TGCATTCAGA AGTCTTCAAC GGTGACCAGT 1740

GATGACAATG ACAACGAGTA CACCACGGAA CTCGAACTCT CGCCTTTCAG TACCATACGC 1800 

CGTGAAATGT CCACTTTTGC TTCTGTCGGA ACTCATTGCT GA 1842
(93) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Arg Ala Pro Gly Ala Leu Leu Ala Arg Met Ser Arg Leu Leu Leu 
1 5 10 15 

Leu Leu Leu Leu Lys Val Ser Ala Ser Ser Ala Leu Gly Val Ala Pro 
20 25 30 

Ala Ser Arg Asn Glu Thr Cys Leu Gly Glu Ser Cys Ala Pro Thr Val 
35 40 45 

Ile Gln Arg Arg Gly Arg Asp Ala Trp Gly Pro Gly Asn Ser Ala Arg
50 55 60

Asp Val Leu Arg Ala Arg Ala Pro Arg Glu Glu Gln Gly Ala Ala Phe
65 70 75 80

Leu Ala Gly Pro Ser Trp Asp Leu Pro Ala Ala Pro Gly Arg Asp Pro
85 90 95

Ala Ala Gly Arg Gly Ala Glu Ala Ser Ala Ala Gly Pro Pro Gly Pro
100 105 110

Pro Thr Arg Pro Pro Gly Pro Trp Arg Trp Lys Gly Ala Arg Gly Gln
115 120 125

Glu Pro Ser Glu Thr Leu Gly Arg Gly Asn Pro Thr Ala Leu Gln Leu
130 135 140

Phe Leu Gln Ile Ser Glu Glu Glu Glu Lys Gly Pro Arg Gly Ala Gly
145 150 155 160

Ile Ser Gly Arg Ser Gln Glu Gln Ser Val Lys Thr Val Pro Gly Ala
165 170 175

Ser Asp Leu Phe Tyr Trp Pro Arg Arg Ala Gly Lys Leu Gln Gly Ser
180 185 190

His His Lys Pro Leu Ser Lys Thr Ala Asn Gly Leu Ala Gly His Glu
195 200 205

Gly Trp Thr Ile Ala Leu Pro Gly Arg Ala Leu Ala Gln Asn Gly Ser
210 215 220



Leu Gly Glu Gly Ile His Glu Pro Gly Gly Pro Arg Arg Gly Asn Ser
225 230 235 240

Thr Asn Arg Arg Val Arg Leu Lys Asn Pro Phe Tyr Pro Leu Thr Gln
245 250 255

Glu Ser Tyr Gly Ala Tyr Ala Val Met Cys Leu Ser Val Val Ile Phe
260 265 270

Gly Thr Gly Ile Ile Gly Asn Leu Ala Val Met Ser Ile Val Cys His
275 280 285

Asn Tyr Tyr Met Arg Ser Ile Ser Asn Ser Leu Leu Ala Asn Leu Ala
290 295 300

Phe Trp Asp Phe Leu Ile Ile Phe Phe Cys Leu Pro Leu Val Ile Phe
305 310 315 320

His Glu Leu Thr Lys Lys Trp Leu Leu Glu Asp Phe Ser Cys Lys Ile
325 330 335

Val Pro Tyr Ile Glu Val Ala Ser Leu Gly Val Thr Thr Phe Thr Leu
340 345 350

Cys Ala Leu Cys Ile Asp Arg Phe Arg Ala Ala Thr Asn Val Gln Met
355 360 365

Tyr Tyr Glu Met Ile Glu Asn Cys Ser Ser Thr Thr Ala Lys Leu Ala
370 375 380

Val Ile Trp Val Gly Ala Leu Leu Leu Ala Leu Pro Glu Val Val Leu
385 390 395 400

Arg Gln Leu Ser Lys Glu Asp Leu Gly Phe Ser Gly Arg Ala Pro Ala
405 410 415

Glu Arg Cys Ile Ile Lys Ile Ser Pro Asp Leu Pro Asp Thr Ile Tyr
420 425 430

Val Leu Ala Leu Thr Tyr Asp Ser Ala Arg Leu Trp Trp Tyr Phe Gly
435 440 445



Cys Tyr Phe Cys Leu Pro Thr Leu Phe Thr Ile Thr Cys Ser Leu Val
450 455 460

Thr Ala Arg Lys Ile Arg Lys Ala Glu Lys Ala Cys Thr Arg Gly Asn
465 470 475 480

Lys Arg Gln Ile Gln Leu Glu Ser Gln Met Asn Cys Thr Val Val Ala
485 490 495

Leu Thr Ile Leu Tyr Gly Phe Cys Ile Ile Pro Glu Asn Ile Cys Asn
500 505 510

Ile Val Thr Ala Tyr Met Ala Thr Gly Val Ser Gln Gln Thr Met Asp
515 520 525

Leu Leu Asn Ile Ile Ser Gln Phe Leu Leu Phe Phe Lys Ser Cys Val
530 535 540

Thr Pro Val Leu Leu Phe Cys Leu Cys Lys Pro Phe Ser Arg Ala Phe
545 550 555 560

Met Glu Cys Cys Cys Cys Cys Cys Glu Glu Cys Ile Gln Lys Ser Ser
565 570 575

Thr Val Thr Ser Asp Asp Asn Asp Asn Glu Tyr Thr Thr Glu Leu Glu
580 585 590

Leu Ser Pro Phe Ser Thr Ile Arg Arg Glu Met Ser Thr Phe Ala Ser
595 600 605

Val Gly Thr His Cys 
610 

(94) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
CAGAATTCAG AGAAAAAAAG TGAATATGGT TTTT 

(95) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

TTGGATCCCT GGTGCATAAC AATTGAAAGA AT 

(96) INFORMATION FOR SEQ ID NO: 95: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1248 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

ATGGTTTTTG CTCACAGAAT GGATAACAGC AAGCCACATT TGATTATTCC TACACTTCTG 60 

GTGCCCCTCC AAAACCGCAG CTGCACTGAA ACAGCCACAC CTCTGCCAAG CCAATACCTG 120 

ATGGAATTAA GTGAGGAGCA CAGTTGGATG AGCAACCAAA CAGACCTTCA CTATGTGCTG 180 

AAACCCGGGG AAGTGGCCAC AGCCAGCATC TTCTTTGGGA TTCTGTGGTT GTTTTCTATC 240 

TTCGGCAATT CCCTGGTTTG TTTGGTCATC CATAGGAGTA GGAGGACTCA GTCTACCACC 300 

AACTACTTTG TGGTCTCCAT GGCATGTGCT GACCTTCTCA TCAGCGTTGC CAGCACGCCT 360 

TTCGTCCTGC TCCAGTTCAC CACTGGAAGG TGGACGCTGG GTAGTGCAAC GTGCAAGGTT 420 

GTGCGATATT TTCAATATCT CACTCCAGGT GTCCAGATCT ACGTTCTCCT CTCCATCTGC 480 

ATAGACCGGT TCTACACCAT CGTCTATCCT CTGAGCTTCA AGGTGTCCAG AGAAAAAGCC 540 

AAGAAAATGA TTGCGGCATC GTGGATCTTT GATGCAGGCT TTGTGACCCC TGTGCTCTTT 600 

TTCTATGGCT CCAACTGGGA CAGTCATTGT AACTATTTCC TGCCCTCCTC TTGGGAAGGC 660

ACTGCCTACA CTGTCATCCA CTTCTTGGTG GGCTTTGTGA TTCCATCTGT CCTCATAATT 720 
TTATTTTACC AAAAGGTCAT AAAATATATT TGGAGAATAG GCACAGATGG CCGAACGGTG 780

AGGAGGACAA TGAACATTGT CCCTCGGACA AAAGTGAAAA CTATCAAGAT GTTCCTCATT 840 

TTAAATCTGT TGTTTTTGCT CTCCTGGCTG CCTTTTCATG TAGCTCAGCT ATGGCACCCC 900 

CATGAACAAG ACTATAAGAA AAGTTCCCTT GTTTTCACAG CTATCACATG GATATCCTTT 960 

AGTTCTTCAG CCTCTAAACC TACTCTGTAT TCAATTTATA ATGCCAATTT TCGGAGAGGG 1020 

ATGAAAGAGA CTTTTTGCAT GTCCTCTATG AAATGTTACC GAAGCAATGC CTATACTATC 1080

ACAACAAGTT CAAGGATGGC CAAAAAAAAC TACGTTGGCA TTTCAGAAAT CCCTTCCATG 1140 

GCCAAAACTA TTACCAAAGA CTCGATCTAT GACTCATTTG ACAGAGAAGC CAAGGAAAAA 1200 

AAGCTTGCTT GGCCCATTAA CTCAAATCCA CCAAATACTT TTGTCTAA 1248 
(97) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Val Phe Ala His Arg Met Asp Asn Ser Lys Pro His Leu Ile Ile
1 5 10 15

Pro Thr Leu Leu Val Pro Leu Gln Asn Arg Ser Cys Thr Glu Thr Ala
20 25 30

Thr Pro Leu Pro Ser Gln Tyr Leu Met Glu Leu Ser Glu Glu His Ser
35 40 45

Trp Met Ser Asn Gln Thr Asp Leu His Tyr Val Leu Lys Pro Gly Glu
50 55 60

Val Ala Thr Ala Ser Ile Phe Phe Gly Ile Leu Trp Leu Phe Ser Ile
65 70 75 80

Phe Gly Asn Ser Leu Val Cys Leu Val Ile His Arg Ser Arg Arg Thr
85 90 95

Gln Ser Thr Thr Asn Tyr Phe Val Val Ser Met Ala Cys Ala Asp Leu
100 105 110

Leu Ile Ser Val Ala Ser Thr Pro Phe Val Leu Leu Gln Phe Thr Thr
115 120 125

Gly Arg Trp Thr Leu Gly Ser Ala Thr Cys Lys Val Val Arg Tyr Phe
130 135 140

Gln Tyr Leu Thr Pro Gly Val Gln Ile Tyr Val Leu Leu Ser Ile Cys
145 150 155 160

Ile Asp Arg Phe Tyr Thr Ile Val Tyr Pro Leu Ser Phe Lys Val Ser
165 170 175

Arg Glu Lys Ala Lys Lys Met Ile Ala Ala Ser Trp Ile Phe Asp Ala
180 185 190

Gly Phe Val Thr Pro Val Leu Phe Phe Tyr Gly Ser Asn Trp Asp Ser
195 200 205

His Cys Asn Tyr Phe Leu Pro Ser Ser Trp Glu Gly Thr Ala Tyr Thr
210 215 220

Val Ile His Phe Leu Val Gly Phe Val Ile Pro Ser Val Leu Ile Ile
225 230 235 240

Leu Phe Tyr Gln Lys Val Ile Lys Tyr Ile Trp Arg Ile Gly Thr Asp
245 250 255

Gly Arg Thr Val Arg Arg Thr Met Asn Ile Val Pro Arg Thr Lys Val
260 265 270

Lys Thr Ile Lys Met Phe Leu Ile Leu Asn Leu Leu Phe Leu Leu Ser
275 280 285

Trp Leu Pro Phe His Val Ala Gln Leu Trp His Pro His Glu Gln Asp
290 295 300

Tyr Lys Lys Ser Ser Leu Val Phe Thr Ala Ile Thr Trp Ile Ser Phe
305 310 315 320

Ser Ser Ser Ala Ser Lys Pro Thr Leu Tyr Ser Ile Tyr Asn Ala Asn
325 330 335

Phe Arg Arg Gly Met Lys Glu Thr Phe Cys Met Ser Ser Met Lys Cys
340 345 350

Tyr Arg Ser Asn Ala Tyr Thr Ile Thr Thr Ser Ser Arg Met Ala Lys
355 360 365

Lys Asn Tyr Val Gly Ile Ser Glu Ile Pro Ser Met Ala Lys Thr Ile
370 375 380

Thr Lys Asp Ser Ile Tyr Asp Ser Phe Asp Arg Glu Ala Lys Glu Lys
385 390 395 400

Lys Leu Ala Trp Pro Ile Asn Ser Asn Pro Pro Asn Thr Phe Val
405 410 415

(98) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
GGAAAGCTTA ACQATCCCCA GGAGCAACAT 30

(99) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
CTCGGATCCT ACGAGAGCAT TTTTCACACA G 31
(100) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1842 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120 

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180 

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240 

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300 

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360 

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420 

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480 

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540 

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600 

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660 

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATTTTCT AACCATGTTT 720 

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780 

GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840 

TTCATAGCCT ACTTCAACAG CTGCGTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCCCT 960 

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020 

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080 

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140 

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200

TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260 

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320 

CCTGCCTCTG TCCATTTCAA GGGTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380 
AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAGTG CTGCCACCAG CCACCCTAAA 1500 

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTG CTGACTATCC CAAGCCTGCC 1560

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GACAACCCTG AGCTCTCTGC CTCCCATTGC 1620 

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCC TGAGTCGGCC 1680 

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTGGAGTC TGACACCATC 1740 

GCTGACCTTC CTGACCCTAC TGTAGTCACT ACCAGTACCA ATGATTACCA TGATGTCGTG 1800 

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842 
(101) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Phe Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Pro
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Gly
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Ser Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr
500 505 510

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala
515 520 525

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro
530 535 540

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala
545 550 555 560

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu
565 570 575

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser
580 585 590

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro
595 600 605

Asp Glu Met Ala Val
610

(102) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 
TCCAAGCTTC GCCATGGGAC ATAACGGGAG CT 32 

(103) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
CGTGAATTCC AAGAATTTAC AATCCTTGCT 30
(104) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1548 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

ATGGGACATA ACGGGAGCTG GATCTCTCCA AATGCCAGCG AGCCGCACAA CGCGTCCGGC 60

GCCGAGGCTG CGGGTGTGAA CCGCAGCGCG CTCGGGGAGT TCGGCGAGGC GCAGCTGTAC 120

CGCCAGTTCA CCACCACCGT GCAGGTCGTC ATCTTCATAG GCTCGCTGCT CGGAAACTTC 180

ATGGTGTTAT GGTCAACTTG CCGCACAAPp. rST/^TTr'afta'P r«'Tiri»f»r»n/-t*-»** CAGGTTCATT 240

AAAAACCTGG CCTGCTCGGG GATTTGTGCC AGCCTGGTCT GTGTGCCCTT CGACATCATC 300

CTCAGCACCA GTCCTCACTG TTGCTGGTGG ATCTACACCA TGCTCTTCTG CAAGGTCGTC 360

AAATTTTTGC ACAAAGTATT CTGCTCTGTG ACCATCCTCA GCTTCCCTGC TATTGCTTTG 420

GACAGGTACT ACTCAGTCCT CTATCCACTG GAGAGGAAAA TATCTGATGC CAAGTCCCGT 480

GAACTGGTGA TGTACATCTG GGCCCATGCA GTGGTGGCCA GTGTCCCTGT GTTTGCAGTA 540

ACCAATGTGG CTGACATCTA TGCCACGTCC ACCTGCACGG AAGTCTGGAG CAACTCCTTG 600

GGCCACCTGG TGTACGTTCT GGTGTATAAC ATCACCACGG TCATTGTGCC TGTGGTGGTG 660

GTGTTCCTCT TCTTGATACT GATCCGACGG GCCCTGAGTG CCAGCCAGAA GAAGAAGGTC 720

ATCATAGCAG CGCTCCGGAC CCCACAGAAC ACCATCTCTA TTCCCTATGC CTCCCAGCGG 780

GAGGCCGAGC TGCACGCCAC CCTGCTCTCC ATGGTGATGG TCTTCATCTT GTGTAGCGTG 840

CCCTATGCCA CCCTGGTCGT CTACCAGACT GTGCTCAATG TCCCTGACAC TTCCGTCTTC 900

TTGCTGCTCA CTGCTGTTTG GCTGCCCAAA GTCTCCCTGC TGGCAAACCC TGTTCTCTTT 960

CTTACTGTGA ACAAATCTGT CCGCAAGTGC TTGATAGGGA CCCTGGTGCA ACTACACCAC 1020

CGGTACAGTC GCCGTAATGT GGTCAGTACA GGGAGTGGCA TGGCTGAGGC CAGCCTGGAA 1080

CCCAGCATAC GCTCGGGTAG CCAGCTCCTG GAGATGTTCC ACATTGGGCA GCAGCAGATC 1140

TTTAAGCGCA CAGAGGATGA GGAAGAGAGT GAGGCCAAGT ACATTGGCTC AGCTGACTTC 1200 

CAGGCCAAGG AGATATTTAG CACCTGCCTC GAGGGAGAGC AGGGGCCACA GTTTGCGCCC 1260 

TCTGCCCCAC CCCTGAGCAC AGTGGACTCT GTATCCCAGG TGGCACCGGC AGCCCCTGTG 1320 

GAACCTGAAA CATTCCCTGA TAAGTATTCC CTGCAGTTTG GCTTTGGGCC TTTTGAGTTG 1380 

CCTCCTCAGT GGCTCTCAGA GACCCGAAAC AGCAAGAAGC GGCTGCTTCC CCCCTTGGGC 1440 

AACACCCCAG AAGAGCTGAT CCAGACAAAG GTGCCCAAGG TAGGCAGGGT GGAGCGGAAG 1500 

ATGAGCAGAA ACAATAAAGT GAGCATTTTT CCAAAGGTGG ATTCCTAG 1548 
(105) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Gly His Asn Gly Ser Trp Ile Ser Pro Asn Ala Ser Glu Pro His
1 5 10 15

Asn Ala Ser Gly Ala Glu Ala Ala Gly Val Asn Arg Ser Ala Leu Gly
20 25 30

Glu Phe Gly Glu Ala Gln Leu Tyr Arg Gln Phe Thr Thr Thr Val Gln
35 40 45

Val Val Ile Phe Ile Gly Ser Leu Leu Gly Asn Phe Met Val Leu Trp
50 55 60

Ser Thr Cys Arg Thr Thr Val Phe Lys Ser Val Thr Asn Arg Phe Ile
65 70 75 80

Lys Asn Leu Ala Cys Ser Gly Ile Cys Ala Ser Leu Val Cys Val Pro
85 90 95

Phe Asp Ile Ile Leu Ser Thr Ser Pro His Cys Cys Trp Trp Ile Tyr
100 105 110

Thr Met Leu Phe Cys Lys Val Val Lys Phe Leu His Lys Val Phe Cys
115 120 125

Ser Val Thr Ile Leu Ser Phe Pro Ala Ile Ala Leu Asp Arg Tyr Tyr
130 135 140

Ser Val Leu Tyr Pro Leu Glu Arg Lys Ile Ser Asp Ala Lys Ser Arg
145 150 155 160

Glu Leu Val Met Tyr Ile Trp Ala His Ala Val Val Ala Ser Val Pro
165 170 175

Val Phe Ala Val Thr Asn Val Ala Asp Ile Tyr Ala Thr Ser Thr Cys
180 185 190

Thr Glu Val Trp Ser Asn Ser Leu Gly His Leu Val Tyr Val Leu Val
195 200 205

Tyr Asn Ile Thr Thr Val Ile Val Pro Val Val Val Val Phe Leu Phe
210 215 220

Leu Ile Leu Ile Arg Arg Ala Leu Ser Ala Ser Gln Lys Lys Lys Val
225 230 235 240

Ile Ile Ala Ala Leu Arg Thr Pro Gln Asn Thr Ile Ser Ile Pro Tyr
245 250 255

Ala Ser Gln Arg Glu Ala Glu Leu His Ala Thr Leu Leu Ser Met Val
260 265 270

Met Val Phe Ile Leu Cys Ser Val Pro Tyr Ala Thr Leu Val Val Tyr
275 280 285

Gln Thr Val Leu Asn Val Pro Asp Thr Ser Val Phe Leu Leu Leu Thr
290 295 300

Ala Val Trp Leu Pro Lys Val Ser Leu Leu Ala Asn Pro Val Leu Phe
305 310 315 320

Leu Thr Val Asn Lys Ser Val Arg Lys Cys Leu Ile Gly Thr Leu Val
325 330 335

Gln Leu His His Arg Tyr Ser Arg Arg Asn Val Val Ser Thr Gly Ser
340 345 350

Gly Met Ala Glu Ala Ser Leu Glu Pro Ser Ile Arg Ser Gly Ser Gln
355 360 365

Leu Leu Glu Met Phe His Ile Gly Gln Gln Gln Ile Phe Lys Pro Thr
370 375 380

Glu Asp Glu Glu Glu Ser Glu Ala Lys Tyr Ile Gly Ser Ala Asp Phe
385 390 395 400

Gln Ala Lys Glu Ile Phe Ser Thr Cys Leu Glu Gly Glu Gln Gly Pro
405 410 415

Gln Phe Ala Pro Ser Ala Pro Pro Leu Ser Thr Val Asp Ser Val Ser
420 425 430

Gln Val Ala Pro Ala Ala Pro Val Glu Pro Glu Thr Phe Pro Asp Lys
435 440 445

Tyr Ser Leu Gln Phe Gly Phe Gly Pro Phe Glu Leu Pro Pro Gln Trp
450 455 460

Leu Ser Glu Thr Arg Asn Ser Lys Lys Arg Leu Leu Pro Pro Leu Gly
465 470 475 480

Asn Thr Pro Glu Glu Leu Ile Gln Thr Lys Val Pro Lys Val Gly Arg
485 490 495

Val Glu Arg Lys Met Ser Arg Asn Asn Lys Val Ser Ile Phe Pro Lys
500 505 510

Val Asp Ser
515

(106) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GGAGAATTCA CTAGGCGAGG CGCTCCATC

(107) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
GGAGGATCCA GGAAACCTTA GGCCGAGTCC

(108) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1164 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

ATGAATCGGC ACCATCTGCA GGATCACTTT CTCGAAATAG ACAAGAAGAA CTGCTGTGTG 60

TTCCGAGATG ACTTCATTGC CAAGGTGTTG CCGCCGGTGT TGGGGCTGGA GTTTATCTTT 120

GGGCTTCTGG GCAATGGCCT TGCCCTGTGG ATTTTCTGTT TCCACCTCAA GTCCTGGAAA 180

TCCAGCCGGA TTTTCCTGTT CAACCTGGCA GTAGCTGACT TTCTACTGAT CATCTGCCTG 240

CCGTTCGTGA TGGACTACTA TGTGCGGCGT TCAGACTGGA ACTTTGGGGA CATCCCTTGC 300

CGGCTGGTGC TCTTCATGTT TGCCATGAAC CGCCAGGGCA GCATCATCTT CCTCACGGTG 360

GTGGCGGTAG ACAGGTATTT CCGGGTGGTC CATCCCCACC ACGCCCTGAA CAAGATCTCC 420

AATTGGACAG CAGCCATCAT CTCTTGCCTT CTGTGGGGCA TCACTGTTGG CCTAACAGTC 480

CACCTCCTGA AGAAGAAGTT GCTGATCCAG AATGGCCCTG CAAATGTGTG CATCAGCTTC 540

AGCATCTGCC ATACCTTCCG GTGGCACGAA GCTATGTTCC TCCTGGAGTT CCTCCTGCCC 600

CTGGGCATCA TCCTGTTCTG CTCAGCCAGA ATTATCTGGA GCCTGCGGCA GAGACAAATG 660

GACCGGCATG CCAAGATCAA GAGAGCCATC ACCTTCATCA TGGTGGTGGC CATCGTCTTT 720

GTCATCTGCT TCCTTCCCAG CGTGGTTGTG CGGATCCGCA TCTTCTGGCT CCTGCACACT 780

TCGGGCACGC AGAATTGTGA AGTGTACCGC TCGGTGGACC TGGCGTTCTT TATCACTCTC 840

AGCTTCACCT ACATGAACAG CATGCTGGAC CCCGTGGTGT ACTACTTCTC CAGCCCATCC 900

TTTCCCAACT TCTTCTCCAC TTTGATCAAC CGCTGCCTCC AGAGGAAGAT GACAGGTGAG 960

CCAGATAATA ACCGCAGCAC GAGCGTCGAG CTCACAGGGG ACCCCAACAA AACCAGAGGC 1020

GCTCCAGAGG CGTTAATGGC CAACTCCGGT GAGCCATGGA GCCCCTCTTA TCTGGGCCCA 1080

ACCTCAAATA ACCATTCCAA GAAGGGACAT TGTCACCAAG AACCAGCATC TCTGGAGAAA 1140

CAGTTGGGCT GTTGCATCGA GTAA 1164

(109) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Met Asn Arg His His Leu Gln Asp His Phe Leu Glu Ile Asp Lys Lys
1 5 10 15

Asn Cys Cys Val Phe Arg Asp Asp Phe Ile Ala Lys Val Leu Pro Pro
20 25 30

Val Leu Gly Leu Glu Phe Ile Phe Gly Leu Leu Gly Asn Gly Leu Ala
35 40 45

Leu Trp Ile Phe Cys Phe His Leu Lys Ser Trp Lys Ser Ser Arg Ile
50 55 60

Phe Leu Phe Asn Leu Ala Val Ala Asp Phe Leu Leu Ile Ile Cys Leu
65 70 75 80

Pro Phe Val Met Asp Tyr Tyr Val Arg Arg Ser Asp Trp Asn Phe Gly
85 90 95

Asp Ile Pro Cys Arg Leu Val Leu Phe Met Phe Ala Met Asn Arg Gln
100 105 110

Gly Ser Ile Ile Phe Leu Thr Val Val Ala Val Asp Arg Tyr Phe Arg
115 120 125

Val Val His Pro His His Ala Leu Asn Lys Ile Ser Asn Trp Thr Ala
130 135 140

Ala Ile Ile Ser Cys Leu Leu Trp Gly Ile Thr Val Gly Leu Thr Val
145 150 155 160

His Leu Leu Lys Lys Lys Leu Leu Ile Gln Asn Gly Pro Ala Asn Val
165 170 175

Cys Ile Ser Phe Ser Ile Cys His Thr Phe Arg Trp His Glu Ala Met
180 185 190

Phe Leu Leu Glu Phe Leu Leu Pro Leu Gly Ile Ile Leu Phe Cys Ser
195 200 205

Ala Arg Ile Ile Trp Ser Leu Arg Gln Arg Gln Met Asp Arg His Ala
210 215 220

Lys Ile Lys Arg Ala Ile Thr Phe Ile Met Val Val Ala Ile Val Phe
225 230 235 240

Val Ile Cys Phe Leu Pro Ser Val Val Val Arg Ile Arg Ile Phe Trp
245 250 255

Leu Leu His Thr Ser Gly Thr Gln Asn Cys Glu Val Tyr Arg Ser Val
260 265 270

Asp Leu Ala Phe Phe Ile Thr Leu Ser Phe Thr Tyr Met Asn Ser Met
275 280 285

Leu Asp Pro Val Val Tyr Tyr Phe Ser Ser Pro Ser Phe Pro Asn Phe
290 295 300

Phe Ser Thr Leu Ile Asn Arg Cys Leu Gln Arg Lys Met Thr Gly Glu
305 310 315 320

Pro Asp Asn Asn Arg Ser Thr Ser Val Glu Leu Thr Gly Asp Pro Asn
325 330 335

Lys Thr Arg Gly Ala Pro Glu Ala Leu Met Ala Asn Ser Gly Glu Pro
340 345 350

Trp Ser Pro Ser Tyr Leu Gly Pro Thr Ser Asn Asn His Ser Lys Lys
355 360 365

Gly His Cys His Gln Glu Pro Ala Ser Leu Glu Lys Gln Leu Gly Cys
370 375 380

Cys Ile Glu
385

(110) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
ACCATGGCTT GCT^TGGCAG TGCGGCCAGG GGGCACT 37 

(111) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
CGACCAGGAC AAACAGCATC TTGGTCACTT GTCTCCGGC 39

(112) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GACCAAGATG CTGTTTGTCC TGGTCGTGGT GTTTGGCAT 39

(113) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
CGGAATTCAG GATGGATCGG TCTCTTGCTG CGCCT 35

(114) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1212 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

ATGGCTTGCA ATGGCAGTGC GGCCAGGGGG CACTTTGACC CTGAGGACTT GAACCTGACT 60

GACGAGGCAC TGAGACTCAA GTACCTGGGG CCCCAGCAGA CAGAGCTGTT CATGCCCATC 120

TGTGCCACAT ACCTGCTGAT CTTCGTGGTG GGCGCTGTGG GCAATGGGCT GACCTGTCTG 180

GTCATCCTGC GCCACAAGGC CATGCGCACG CCTACCAACT ACTACCTCTT CAGCCTGGCC 240

GTGTCGGACC TGCTGGTGCT GCTGGTGGGC CTGCCCCTGG AGCTCTATGA GATGTGGCAC 300

AACTACCCCT TCCTGCTGGG CGTTGGTGGC TGCTATTTCC GCACGCTACT GTTTGAGATG 360

GTCTGCCTGG CCTCAGTGCT CAACGTCACT GCCCTGAGCG TGGAACGGTA TGTGGCCGTG 420

GTGCACCCAC TCCAGGCCAG GTCCATGGTG ACGCGGGCCC ATGTGCGCCG AGTGCTTGGG 480

GCCGTCTGGG GTCTTGCCAT GCTCTCCTCC CTGCCCAACA CCAGCCTGCA CGGCATCCGG 540

CAGCTGCACG TGCCCTGCCG GGGCCCAGTG CCAGACTCAG CTGTTTGCAT GCTGGTCCGC 600

CCACGGGCCC TCTACAACAT GGTAGTGCAG ACCACCGCGC TGCTCTTCTT CTGCCTGCCC 660

ATGGCCATCA TGAGCGTGCT CTACCTGCTC ATTGGGCTGC GACTGCGGCG GGAGAGGCTG 720

CTGCTCATGC AGGAGGCCAA GGGCAGGGGC TCTGCAGCAG CCAGGTCCAG ATACACCTGC 780

AGGCTCCAGC AGCACGATCG GGGCCGGAGA CAAGTGACCA AGATGCTGTT TGTCCTGGTC 840

GTGGTGTTTG GCATCTGCTG GGCCCCGTTC CACGCCGACC GCGTCATGTG GAGCGTCGTC 900

TCACAGTGGA CAGATGGCCT GCACCTGGCC TTCCAGCACG TGCACGTCAT CTCCGGCATC 960

TTCTTCTACC TGGGCTCGGC GGCCAACCCC GTGCTCTATA GCCTCATGTC CAGCCGCTTC 1020

CGAGAGACCT TCCAGGAGGC CCTGTGCCTC GGGGCCTGCT GCCATCGCCT CAGACCCCGC 1080

CACAGCTCCC ACAGCCTCAG CAGGATGACC ACAGGCAGCA CCCTGTGTGA TGTCGGCTCC 1140

CTGGGCAGCT GGGTCCACCC CCTGGCTGGG AACGATGGCC CAGAGGCGCA GCAAGAGACC 1200

GATCCATCCT GA 1212

(115) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 403 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Met Ala Cys Asn Gly Ser Ala Ala Arg Gly His Phe Asp Pro Glu Asp
1 5 10 15

Leu Asn Leu Thr Asp Glu Ala Leu Arg Leu Lys Tyr Leu Gly Pro Gln
20 25 30

Gln Thr Glu Leu Phe Met Pro Ile Cys Ala Thr Tyr Leu Leu Ile Phe
35 40 45

Val Val Gly Ala Val Gly Asn Gly Leu Thr Cys Leu Val Ile Leu Arg
50 55 60

His Lys Ala Met Arg Thr Pro Thr Asn Tyr Tyr Leu Phe Ser Leu Ala
65 70 75 80

Val Ser Asp Leu Leu Val Leu Leu Val Gly Leu Pro Leu Glu Leu Tyr
85 90 95

Glu Met Trp His Asn Tyr Pro Phe Leu Leu Gly Val Gly Gly Cys Tyr
100 105 110

Phe Arg Thr Leu Leu Phe Glu Met Val Cys Leu Ala Ser Val Leu Asn
115 120 125

Val Thr Ala Leu Ser Val Glu Arg Tyr Val Ala Val Val His Pro Leu
130 135 140

Gln Ala Arg Ser Met Val Thr Arg Ala His Val Arg Arg Val Leu Gly
145 150 155 160

Ala Val Trp Gly Leu Ala Met Leu Cys Ser Leu Pro Asn Thr Ser Leu
165 170 175

His Gly Ile Arg Gln Leu His Val Pro Cys Arg Gly Pro Val Pro Asp
180 185 190

Ser Ala Val Cys Met Leu Val Arg Pro Arg Ala Leu Tyr Asn Met Val
195 200 205

Val Gln Thr Thr Ala Leu Leu Phe Phe Cys Leu Pro Met Ala Ile Met
210 215 220

Ser Val Leu Tyr Leu Leu Ile Gly Leu Arg Leu Arg Arg Glu Arg Leu
225 230 235 240

Leu Leu Met Gln Glu Ala Lys Gly Arg Gly Ser Ala Ala Ala Arg Ser
245 250 255

Arg Tyr Thr Cys Arg Leu Gln Gln His Asp Arg Gly Arg Arg Gln Val
260 265 270

Thr Lys Met Leu Phe Val Leu Val Val Val Phe Gly Ile Cys Trp Ala
275 280 285

Pro Phe His Ala Asp Arg Val Met Trp Ser Val Val Ser Gln Trp Thr
290 295 300

Asp Gly Leu His Leu Ala Phe Gln His Val His Val Ile Ser Gly Ile
305 310 315 320

Phe Phe Tyr Leu Gly Ser Ala Ala Asn Pro Val Leu Tyr Ser Leu Met
325 330 335

Ser Ser Arg Phe Arg Glu Thr Phe Gln Glu Ala Leu Cys Leu Gly Ala
340 345 350

Cys Cys His Arg Leu Arg Pro Arg His Ser Ser His Ser Leu Ser Arg
355 360 365

Met Thr Thr Gly Ser Thr Leu Cys Asp Val Gly Ser Leu Gly Ser Trp
370 375 380

Val His Pro Leu Ala Gly Asn Asp Gly Pro Glu Ala Gln Gln Glu Thr
385 390 395 400

Asp Pro Ser

WO 00/22129 PCT/US99/23938

(116) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

GGAAGCTTCA GGCCCAAAGA TGGGGAACAT

(117) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

GTGGATCCAC CCGCGGAGGA CCCAGGCTAG 

(118) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1098 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATGGGGAACA TCACTGCAGA CAACTCCTCG ATGAGCTGTA CCATCGACCA TACCATCCAC 60

CAGACGCTGG CCCCGGTGGT CTATGTTACC GTGCTGGTGG TGGGCTTCCC GGCCAACTGC 120

CTGTCCCTCT ACTTCGGCTA CCTGCAGATC AAGGCCCGGA ACGAGCTGGG CGTGTACCTG 180

TGCAACCTGA CGGTGGCCGA CCTCTTCTAC ATCTGCTCGC TCCCCTTCTG GCTGCAGTAC 240

GTGCTGCAGC ACGACAACTG GTCTCACGGC GACCTGTCCT GCCAGGTGTG CGGCATCCTC 300

CTGTACGAGA ACATCTACAT CAGCGTGGGC TTCCTCTGCT GCATCTCCGT GGACCGCTAC 360

CTGGCTGTGG CCCATCCCTT CCGCTTCGAC CAGTTCCGGA CCCTGAAGGC GGCCGTCGGC 420

GTCAGCGTGG TCATCTGGGC CAAGGAGCTG CTGACCAGCA TCTACTTCCT GATGCACGAG 480

GAGGTCATCG AGGACGAGAA CCAGCACCGC GTGTGCTTTG AGCACTACCC CATCCAGGCA 540

TGGCAGCGCG CCATCAACTA CTACCGCTTC CTGGTGGGCT TCCTCTTCCC CATCTGCCTG 600

CTGCTGGCGT CCTACCAGGG CATCCTGCGC GCCGTGCGCC GGAGCCACGG CACCCAGAAG 660

AGCCGCAAGG ACCAGATCCA GCGGCTGGTG CTCAGCACCG TGGTCATCTT CCTGGCCTGC 720

TTCCTGCCCT ACCACGTGTT GCTGCTGGTG CGCAGCGTCT GGGAGGCCAG CTGCGACTTC 780

GCCAAGGGCG TTTTCAACGC CTACCACTTC TCCCTCCTGC TCACCAGCTT CAACTGCGTC 840

GCCGACCCCG TGCTCTACTG CTTCGTCAGC GAGACCACCC ACCGGGACCT GGCCCGCCTC 900

CGCGGGGCCT GCCTGGCCTT CCTCACCTGC TCCAGGACCG GCCGGGCCAG GGAGGCCTAC 960

CCGCTGGGTG CCCCCGAGGC CTCCGGGAAA AGCGGGGCCC AGGGTGAGGA GCCCGAGCTG 1020 

TTGACCAAGC TCCACCCGGC CTTCCAGACC CCTAACTCGC CAGGGTCGGG CGGGTTCCCC 1080 

ACGGGCAGGT TGGCCTAG 1098 
(119) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Gly Asn Ile Thr Ala Asp Asn Ser Ser Met Ser Cys Thr Ile Asp
1 5 10 15

His Thr Ile His Gln Thr Leu Ala Pro Val Val Tyr Val Thr Val Leu
20 25 30

Val Val Gly Phe Pro Ala Asn Cys Leu Ser Leu Tyr Phe Gly Tyr Leu
35 40 45

Gln Ile Lys Ala Arg Asn Glu Leu Gly Val Tyr Leu Cys Asn Leu Thr
50 55 60

Val Ala Asp Leu Phe Tyr Ile Cys Ser Leu Pro Phe Trp Leu Gln Tyr
65 70 75 80

Val Leu Gln His Asp Asn Trp Ser His Gly Asp Leu Ser Cys Gln Val
85 90 95

Cys Gly Ile Leu Leu Tyr Glu Asn Ile Tyr Ile Ser Val Gly Phe Leu
100 105 110

Cys Cys Ile Ser Val Asp Arg Tyr Leu Ala Val Ala His Pro Phe Arg
115 120 125

Phe His Gln Phe Arg Thr Leu Lys Ala Ala Val Gly Val Ser Val Val
130 135 140

Ile Trp Ala Lys Glu Leu Leu Thr Ser Ile Tyr Phe Leu Met His Glu
145 150 155 160

Glu Val Ile Glu Asp Glu Asn Gln His Arg Val Cys Phe Glu His Tyr
165 170 175

Pro Ile Gln Ala Trp Gln Arg Ala Ile Asn Tyr Tyr Arg Phe Leu Val
180 185 190

Gly Phe Leu Phe Pro Ile Cys Leu Leu Leu Ala Ser Tyr Gln Gly Ile
195 200 205

Leu Arg Ala Val Arg Arg Ser His Gly Thr Gln Lys Ser Arg Lys Asp
210 215 220

Gln Ile Gln Arg Leu Val Leu Ser Thr Val Val Ile Phe Leu Ala Cys
225 230 235 240

Phe Leu Pro Tyr His Val Leu Leu Leu Val Arg Ser Val Trp Glu Ala
245 250 255

Ser Cys Asp Phe Ala Lys Gly Val Phe Asn Ala Tyr His Phe Ser Leu
260 265 270

Leu Leu Thr Ser Phe Asn Cys Val Ala Asp Pro Val Leu Tyr Cys Phe
275 280 285

Val Ser Glu Thr Thr His Arg Asp Leu Ala Arg Leu Arg Gly Ala Cys
290 295 300

Leu Ala Phe Leu Thr Cys Ser Arg Thr Gly Arg Ala Arg Glu Ala Tyr
305 310 315 320

Pro Leu Gly Ala Pro Glu Ala Ser Gly Lys Ser Gly Ala Gln Gly Glu
325 330 335

Glu Pro Glu Leu Leu Thr Lys Leu His Pro Ala Phe Gln Thr Pro Asn
340 345 350

Ser Pro Gly Ser Gly Gly Phe Pro Thr Gly Arg Leu Ala
355 360 365



(120) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
GACCTCGAGT CCTTCTACAC CTCATC 26

(121) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
TGCTCTAGAT TCCAGATAGG TGAAAACTTG 30 

(122) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1416 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

ATGGATATTC TTTGTGAAGA AAATACTTCT TTGAGCTCAA CTACGAACTC CCTAATGCAA 60 

TTAAATGATG ACAACAGGCT CTACAGTAAT GACTTTAACT CCGGAGAAGC TAACACTTCT 120 

GATGCATTTA ACTGGACAGT CGACTCTGAA AATCGAACCA ACCTTTCCTG TGAAGGGTGC 180 

CTCTCACCGT CGTGTCTCTC CTTACTTCAT CTCCAGGAAA AAAACTGGTC TGCTTTACTG 240 

ACAGCCGTAG TGATTATTCT AACTATTGCT GGAAACATAC TCGTCATCAT GGCAGTGTCC 300 

CTAGAGAAAA AGCTGCAGAA TGCCACCAAC TATTTCCTGA TGTCACTTGC CATAGCTGAT 360 

ATGCTGCTGG GTTTCCTTGT CATGCCCGTG TCCATGTTAA CCATCCTGTA TGGGTACCGG 420 

TGGCCTCTGC CGAGCAAGCT TTGTGCAGTC TGGATTTACC TGGACGTGCT CTTCTCCACG 480 

GCCTCCATCA TGCACCTCTG CGCCATCTCG CTGGACCGCT ACGTCGCCAT CCAGAATCCC 540 

ATCCACCACA GCCGCTTCAA CTCCAGAACT AAGGCATTTC TGAAAATCAT TGCTGTTTGG 600

ACCATATCAG TAGGTATATC CATGCCAATA CCAGTCTTX^ GGCTACAGGA CGATTCGAAG 660

GTCTTTAAGG AGGGGAGTTG CTTACTCGCC GATGATAACT TTGTCCTGAT CGGCTC^ETTT 720

GTGTCATTTT TCATTCCCTT AACCATCATG GTGATCACCT ACTTTCTAAC TATCAAGTCA 780

CTCCAGAAAG AAGCTACTTT GTGTGTAAGT GATCTTGGCA CACGGGCCAA ATTAGCTTCT 840

TTCAGCTTCC TCCCTCAGAG TTCTTTGTCT TCAGAAAAGC TCTTCCAGCG GTCGATCCAT 900

AGGGAGCCAG GGTCCTACAC AGGCAGGAGG ACTATGCAGT CCATCAGCAA TGAGCAAAAG 960

GCATGCAAGG TGCTGGGCAT CGTCTTCTTC CTGTTTGTGG TGATGTGGTG CCCTTTCTTC 1020

ATCACAAACA TCATGGCCGT CATCTGCAAA GAGTCCTGCA ATGAGGATGT CATTGGGGCC 1080

CTGCTCAATG TGTTTG™ GATCGGrTAT CTCTCTTCAG CAGTCAACCC ACTAGTCTAC 1140

ACACTGTTCA ACAAGACCTA TAGGTCAGCC TTTTCACGGT ATATTCAGTG TCAGTACAAG 1200

GAAAACAAAA AACCATTGCA GTTAATTTTA GTGAACACAA TACCGGCnT GGCCTACAAG 1260

TCTAGCCAAC TTCAAATGGG ACAAAAAAAG AATTCAAAGC AAGATG^^ ,,,,

AATGACTGCT CAATGGTTGC TCTAGGAAAG CAGTATTCTG AAGAGGCTTC TAA^^ 1380

AGCGACGGAG TGAATGAAAA GGTGAGCTGT GTGTGA 1416

(123) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 471 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Met Asp Ile Leu Cys Glu Glu Asn Thr Ser Leu Ser Ser Thr Thr Asn
1 5 10 15

Ser Leu Met Gln Leu Asn Asp Asp Asn Arg Leu Tyr Ser Asn Asp Phe
20 25 30

Asn Ser Gly Glu Ala Asn Thr Ser Asp Ala Phe Asn Trp Thr Val Asp
35 40 45

Ser Glu Asn Arg Thr Asn Leu Ser Cys Glu Gly Cys Leu Ser Pro Ser
50 55 60

Cys Leu Ser Leu Leu His Leu Gln Glu Lys Asn Trp Ser Ala Leu Leu
65 70 75 80

Thr Ala Val Val Ile Ile Leu Thr Ile Ala Gly Asn Ile Leu Val Ile
85 90 95

Met Ala Val Ser Leu Glu Lys Lys Leu Gln Asn Ala Thr Asn Tyr Phe
100 105 110

Leu Met Ser Leu Ala Ile Ala Asp Met Leu Leu Gly Phe Leu Val Met
115 120 125

Pro Val Ser Met Leu Thr Ile Leu Tyr Gly Tyr Arg Trp Pro Leu Pro
130 135 140

Ser Lys Leu Cys Ala Val Trp Ile Tyr Leu Asp Val Leu Phe Ser Thr
145 150 155 160

Ala Ser Ile Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala
165 170 175

Ile Gln Asn Pro Ile His His Ser Arg Phe Asn Ser Arg Thr Lys Ala
180 185 190

Phe Leu Lys Ile Ile Ala Val Trp Thr Ile Ser Val Gly Ile Ser Met
195 200 205

Pro Ile Pro Val Phe Gly Leu Gln Asp Asp Ser Lys Val Phe Lys Glu
210 215 220

Gly Ser Cys Leu Leu Ala Asp Asp Asn Phe Val Leu Ile Gly Ser Phe
225 230 235 240

Val Ser Phe Phe Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Phe Leu
245 250 255

Thr Ile Lys Ser Leu Gln Lys Glu Ala Thr Leu Cys Val Ser Asp Leu
260 265 270

Gly Thr Arg Ala Lys Leu Ala Ser Phe Ser Phe Leu Pro Gln Ser Ser
275 280 285

Leu Ser Ser Glu Lys Leu Phe Gln Arg Ser Ile His Arg Glu Pro Gly
290 295 300

Ser Tyr Thr Gly Arg Arg Thr Met Gln Ser Ile Ser Asn Glu Gln Lys
305 310 315 320

Ala Cys Lys Val Leu Gly Ile Val Phe Phe Leu Phe Val Val Met Trp
325 330 335

Cys Pro Phe Phe Ile Thr Asn Ile Met Ala Val Ile Cys Lys Glu Ser
340 345 350

Cys Asn Glu Asp Val Ile Gly Ala Leu Leu Asn Val Phe Val Trp Ile
355 360 365

Gly Tyr Leu Ser Ser Ala Val Asn Pro Leu Val Tyr Thr Leu Phe Asn
370 375 380

Lys Thr Tyr Arg Ser Ala Phe Ser Arg Tyr Ile Gln Cys Gln Tyr Lys
385 390 395 400

Glu Asn Lys Lys Pro Leu Gln Leu Ile Leu Val Asn Thr Ile Pro Ala
405 410 415

Leu Ala Tyr Lys Ser Ser Gln Leu Gln Met Gly Gln Lys Lys Asn Ser
420 425 430

Lys Gln Asp Ala Lys Thr Thr Asp Asn Asp Cys Ser Met Val Ala Leu
435 440 445

Gly Lys Gln Tyr Ser Glu Glu Ala Ser Lys Asp Asn Ser Asp Gly Val
450 455 460

Asn Glu Lys Val Ser Cys Val
465 470

(124) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GACCTCGAGG TTGCTTAAGA CTGAAGC 27

(125) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
ATTTCTAGAC ATATGTAGCT TGTACCG 27

(126) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1377 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:



ATGGTGAACC TGAGGAATGC GGTGCATTCA TTCCTTGTGC ACCTAATTGG CCTATTGGTT 60

TGGCAATGTG ATATTTCTGT GAGCCCAGTA GCAGCTATAG TAACTGACAT TTTCAATACC 120

TCCGATGGTG GACGCTTCAA ATTCCCAGAC GGGGTACAAA ACTGGCCAGC ACTTTCAATC 180

GTCATCATAA TAATCATGAC AATAGGTGGC AACATCCTTG TGATCATGGC AGTAAGCATG 240

GT^AAAGAAAC TGCACAATGC CACCAATTAC TTCTTAATGT CCCTAGCCAT TGCTGATATG 300

CTAGTGGGAC TACTTGTCAT GCCCCTGTCT CTCCTGGCAA TCCTTTATGA TTATGTCTGG 360

CCACTACCTA GATATTTGTG CCCCGTCTGG ATTTCTTTAG ATGTTTTATT TTCAACAGCG 420

TCCATCATGC ACCTCTGCGC TATATCGCTG GATCGGTATG TAGCAATACG TAATCCTATT 480

GAGCATAGCC GTTTCAATTC GCGGACTAAG GCCATCATGA AGATTGCTAT TGTTTGGGCA 540

ATTTCTATAG GTGTATCAGT TCCTATCCCT GTGATTGGAC TGAGGGACGA AGAAAAGGTG 600

TTCGTGAACA ACACGACGTG CGTGCTCAAC GACCCAAATT TCGTTCTTAT TGGGTCCTTC 660

GTAGCTTTCT TCATACCGCT GACGATTATG GTGATTACGT ATTGCCTGAC CATCTACGTT 720

CTGCGCCGAC AAGCTTTGAT GTTACTGCAC GGCCACACCG AGGAACCGCC TGGACTAAGT 780

CTGGATTTCC TGAAGTGCTG CAAGAGGAAT ACGGCCGAGG AAGAGAACTC TGCAAACCCT 840

AACCAAGACC AGAACGCACG CCGAAGAAAG AAGAAGGAGA GACGTCCTAG GGGCACCATG 900

CAGGCTATCA ACAATGAAAG AAAAGCTTCG AAAGTCCTTG GGATTGTTTT CTTTGTGTTT 960

CTGATCATGT GGTGCCCATT TTTCATTACC AATATTCTGT CTGTTCTTTG TGAGAAGTCC 1020

TGTAACCAAA AGCTCATGGA AAAGCTTCTG AATGTGTTTG TTTGGATTGG CTATGTTTGT 1080

TCAGGAATCA ATCCTCTGGT GTATACTCTG TTCAACAAAA TTTACCGAAG GGCATTCTCC 1140

AACTATTTGC GTTGCAATTA TAAGGTAGAG AAAAAGCCTC CTGTCAGGCA GATTCCAAGA 1200

GTTGCCGCCA CTGCTTTGTC TGGGAGGGAG CTTAATGTTA ACATTTATCG GCATACCAAT 1260

GAACCGGTGA TCGAGAAAGC CAGTGACAAT GAGCCCGGTA TAGAGATGCA AGTTGAGAAT 1320

TTAGAGTTAC CAGTAAATCC CTCCAGTGTG GTTAGCGAAA GGATTAGCAG TGTGTGA 1377



(127) INFORMATION FOR SEQ ID NO:126: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 458 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Met Val Asn Leu Arg Asn Ala Val His Ser Phe Leu Val His Leu Ile 
1 5 10 15 

Gly Leu Leu Val Trp Gln Cys Asp Ile Ser Val Ser Pro Val Ala Ala 
20 25 30 

Ile Val Thr Asp Ile Phe Asn Thr Ser Asp Gly Gly Arg Phe Lys Phe 
35 40 45 

Pro Asp Gly Val Gln Asn Trp Pro Ala Leu Ser Ile Val Ile Ile Ile 
50 55 60 

Ile Met Thr Ile Gly Gly Asn Ile Leu Val Ile Met Ala Val Ser Met 
65 70 75 80 

Glu Lys Lys Leu His Asn Ala Thr Asn Tyr Phe Leu Met Ser Leu Ala 
85 90 95 

Ile Ala Asp Met Leu Val Gly Leu Leu Val Met Pro Leu Ser Leu Leu 
100 105 110 

Ala Ile Leu Tyr Asp Tyr Val Trp Pro Leu Pro Arg Tyr Leu Cys Pro 
115 120 125 

Val Trp Ile Ser Leu Asp Val Leu Phe Ser Thr Ala Ser Ile Met His 
130 135 140 

Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Arg Asn Pro Ile 
145 150 155 160 

Glu His Ser Arg Phe Asn Ser Arg Thr Lys Ala Ile Met Lys Ile Ala 
165 170 175 

Ile Val Trp Ala Ile Ser Ile Gly Val Ser Val Pro Ile Pro Val Ile 
180 185 190 

Gly Leu Arg Asp Glu Glu Lys Val Phe Val Asn Asn Thr Thr Cys Val 
195 200 205 

Leu Asn Asp Pro Asn Phe Val Leu Ile Gly Ser Phe Val Ala Phe Phe 
210 215 220 

Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Cys Leu Thr Ile Tyr Val 
225 230 235 240 

Leu Arg Arg Gln Ala Leu Met Leu Leu His Gly His Thr Glu Glu Pro 
245 250 255 

Pro Gly Leu Ser Leu Asp Phe Leu Lys Cys Cys Lys Arg Asn Thr Ala 
260 265 270 

Glu Glu Glu Asn Ser Ala Asn Pro Asn Gln Asp Gln Asn Ala Arg Arg 
275 280 285 

Arg Lys Lys Lys Glu Arg Arg Pro Arg Gly Thr Met Gln Ala Ile Asn 
290 295 300 

Asn Glu Arg Lys Ala Ser Lys Val Leu Gly Ile Val Phe Phe Val Phe 
305 310 315 320 

Leu Ile Met Trp Cys Pro Phe Phe Ile Thr Asn Ile Leu Ser Val Leu 
325 330 335 

Cys Glu Lys Ser Cys Asn Gln Lys Leu Met Glu Lys Leu Leu Asn Val 
340 345 350 

Phe Val Trp Ile Gly Tyr Val Cys Ser Gly Ile Asn Pro Leu Val Tyr 
355 360 365 

Thr Leu Phe Asn Lys Ile Tyr Arg Arg Ala Phe Ser Asn Tyr Leu Arg 
370 375 380 

Cys Asn Tyr Lys Val Glu Lys Lys Pro Pro Val Arg Gln Ile Pro Arg 
385 390 395 400 

Val Ala Ala Thr Ala Leu Ser Gly Arg Glu Leu Asn Val Asn Ile Tyr 
405 410 415 

Arg His Thr Asn Glu Pro Val Ile Glu Lys Ala Ser Asp Asn Glu Pro 
420 425 430 

Gly Ile Glu Met Gln Val Glu Asn Leu Glu Leu Pro Val Asn Pro Ser 
435 440 445 

Ser Val Val Ser Glu Arg Ile Ser Ser Val 
450 455 

(128) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 
GGTAAGCTTG GCAGTCCACG CCAGGCCTTC 

(129) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

TCCGAATTCT CTGTAGACAC AAGGCTTTGG 
(130) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

ATGGATCAGT TCCCTGAATC AGTGACAGAA AACTTTGAGT ACGATGATTT GGCTGAGGCC 60 

TGTTATATTG GGGACATCGT GGTCTTTGGG ACTGTGTTCC TGTCCATATT CTACTCCGTC 120 

ATCTTTGCCA TTGGCCTGGT GGGAAATTTG TTGGTAGTGT TTGCCCTCAC CAACAGCAAG 180 

AAGCCCAAGA GTGTCACCGA CATTTACCTC CTGAACCTGG CCTTGTCTGA TCTGCTGTTT 240 

GTAGCCACTT TGCCCTTCTG GACTCACTAT TTGATAAATG AAAAGGGCCT CCACAATGCC 300 

ATGTGCAAAT TCACTACCGC CTTCTTCTTC ATCGGCTTTT TTGGAAGCAT ATTCTTCATC 360 

ACCGTCATCA GCATTGATAG GTACCTGGCC ATCGTCCTGG CCGCCAACTC CATGAACAAC 420 

CGGACCGTGC AGCATGGCGT CACCATCAGC CTAGGCGTCT GGGCAGCAGC CATTTTGGTG 480 

GCAGCACCCC AGTTCATGTT CACAAAGCAG AAAGAAAATG AATGCCTTGG TGACTACCCC 540 

GAGGTCCTCC AGGAAATCTG GCCCGTGCTC CGCAATGTGG AAACAAATTT TCTTGGCTTC 600 

CTACTCCCCC TGCTCATTAT GAGTTATTGC TACTTCAGAA TCATCCAGAC GCTGTTTTCC 660 

TGCAAGAACC ACAAGAAAGC CAAAGCCATT AAACTGATCC TTCTGGTGGT CATCGTGTTT 720 

TTCCTCTTCT GGACACCCTA CAACGTTATG ATTTTCCTGG AGACGCTTAA GCTCTATGAC 780 

TTCTTTCCCA GTTGTGACAT GAGGAAGGAT CTGAGGCTGG CCCTCAGTGT GACTGAGACG 840 

GTTGCATTTA GCCATTGTTG CCTGAATCCT CTCATCTATG CATTTGCTGG GGAGAAGTTC 900 

AGAAGATACC TTTACCACCT GTATGGGAAA TGCCTGGCTG TCCTGTGTGG GCGCTCAGTC 960 
CACGTTGATT TCTCCTCATC TGAATCACAA AGGAGCAGGC ATGGAAGTGT TCTGAGCAGC 1020 
AATTTTACTT ACCACACGAG TGATGGAGAT GCATTGCTCC TTCTCTGA 1068 
(131) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Met Asp Gln Phe Pro Glu Ser Val Thr Glu Asn Phe Glu Tyr Asp Asp 
1 5 10 15 

Leu Ala Glu Ala Cys Tyr Ile Gly Asp Ile Val Val Phe Gly Thr Val 
20 25 30 

Phe Leu Ser Ile Phe Tyr Ser Val Ile Phe Ala Ile Gly Leu Val Gly 
35 40 45 

Asn Leu Leu Val Val Phe Ala Leu Thr Asn Ser Lys Lys Pro Lys Ser 
50 55 60 

Val Thr Asp Ile Tyr Leu Leu Asn Leu Ala Leu Ser Asp Leu Leu Phe 
65 70 75 80 

Val Ala Thr Leu Pro Phe Trp Thr His Tyr Leu Ile Asn Glu Lys Gly 
85 90 95 

Leu His Asn Ala Met Cys Lys Phe Thr Thr Ala Phe Phe Phe Ile Gly 
100 105 110 

Phe Phe Gly Ser Ile Phe Phe Ile Thr Val Ile Ser Ile Asp Arg Tyr 
115 120 125 

Leu Ala Ile Val Leu Ala Ala Asn Ser Met Asn Asn Arg Thr Val Gln 
130 135 140 

His Gly Val Thr Ile Ser Leu Gly Val Trp Ala Ala Ala Ile Leu Val 
145 150 155 160 

Ala Ala Pro Gln Phe Met Phe Thr Lys Gln Lys Glu Asn Glu Cys Leu 
165 170 175 

Gly Asp Tyr Pro Glu Val Leu Gln Glu Ile Trp Pro Val Leu Arg Asn 
180 185 190 

Val Glu Thr Asn Phe Leu Gly Phe Leu Leu Pro Leu Leu Ile Met Ser 
195 200 205 

Tyr Cys Tyr Phe Arg Ile Ile Gln Thr Leu Phe Ser Cys Lys Asn His 
210 215 220 

Lys Lys Ala Lys Ala Ile Lys Leu Ile Leu Leu Val Val Ile Val Phe 
225 230 235 240 

Phe Leu Phe Trp Thr Pro Tyr Asn Val Met Ile Phe Leu Glu Thr Leu 
245 250 255 

Lys Leu Tyr Asp Phe Phe Pro Ser Cys Asp Met Arg Lys Asp Leu Arg 
260 265 270 

Leu Ala Leu Ser Val Thr Glu Thr Val Ala Phe Ser His Cys Cys Leu 
275 280 285 

Asn Pro Leu Ile Tyr Ala Phe Ala Gly Glu Lys Phe Arg Arg Tyr Leu 
290 295 300 

Tyr His Leu Tyr Gly Lys Cys Leu Ala Val Leu Cys Gly Arg Ser Val 
305 310 315 320 

His Val Asp Phe Ser Ser Ser Glu Ser Gln Arg Ser Arg His Gly Ser 
325 330 335 

Val Leu Ser Ser Asn Phe Thr Tyr His Thr Ser Asp Gly Asp Ala Leu 
340 345 350 

Leu Leu Leu 
355 

(132) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 
GATCTCCAGT AGGCATAAGT GGACAATTCT GG 32 

(133) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
CTCCTTCGGT CCTCCTATCG TTOTGAGAAG 

(134) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
AGAAGGCCAA GATCGCGCGG CTGGCCCTCA 

(135) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 
CGGCGCCACC GCACGAAAAA GCTCATCTTC 

(136) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 
GCCAAGAAGC GGGTGAAGTT CCTGGTGGTG GCA 
(137) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 
CAGGCGGAAG GTGAAAGTCC TGGTCCTCGT 
(138) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 
CGGCGCCTGC GGGCCAAGCG GCTGGTGGTG GTG 

(139) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 
CCAAGCACAA AGCCAAGAAA GTGACCATCA C 

(140) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 
GCGCCGGCGC ACCAAATGCT TGCTGGTGGT 

(141) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 
CAAAAAGCTG AAGAAATCTA AGAAGATCAT CTTTATTGTC G 

(142) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 
CAAGACCAAG GCAAAACGCA TGATCGCCAT 

(143) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
GTCAAGGAGA AGTCCAAAAG GATCATCATC 

(144) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 
CGCCGCGTGC GGGCCAAGCA GCTCCTGCTC 

(145) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 
CCTGATAAGC GCTATAAAAT GGTCCTGTTT CGA 
(146) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GAAAGACAAA AGAGAGTCAA GAGGATGTCT TTATTG 
(147) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
CGGAGAAAGA GGGTGAAACG CACAGCCATC GCC 
(148) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
AAGCTTCAGC GGGCCAAGGC ACTGGTCACC 

(149) INFORMATION FOR SEQ ID NO:148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
CAGCGGCAGA AGGC^-AAAAG GGTGGCCATC 

(150) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 
CGGCAGAAGG CGAAGCGCAT GATCCTCGCG 

(151) INFORMATION FOR SEQ ID NO:150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 
GAGCGCAACA AGGCCAAAAA GGTGATCATC 

(152) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 
GGTGTAAACA AAAAGGCTAA AAACACAATT ATTCTTATT 

(153) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 
GAGAGCCAGC TCAAGAGCAC CGTGGTG 

(154) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153: 
CCACAAGCAA ACCAAGAAAA TGCTGGCTGT 

(155) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
CATCAAGTGT ATCATGTGCC AAGTACGCCC 

(156) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 
CTAGAGAGTC AGATGAAGTG TACAGTAGTG GCAC 

(157) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 
CGGACAAAAG TGAAAACTAA AAAGATGTTC CTCATT 

(158) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 
GCTGAGGTTC GCAATAAACT AACCATGTTT GTG 

(159) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 
GGGAGGCCGA GCTGAAAGCC ACCCTGCTC 

(160) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 
CAAGATCAAG AGAGCCAAAA CCTTCATCAT G 

(161) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 
CCGGAGACAA GTGAAGAAGA TGCTGTTTGT C 

(162) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 
GCAAGGACCA GATCT^GCGG CTGGTGCTCA 

(163) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 
CAAGAAAGCC AAAGCCAAGA AACTGATCCT TCTG 
(164) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

ATGGAAGATT TGGAGGAAAC ATTATTTGAA GAATTTGAAA ACTATTCCTA TGACCTAGAC 60 

TATTACTCTC TGGAGTCTGA TTTGGAGGAG AAAGTCCAGC TGGGAGTTGT TCACTGGGTC 120 

TCCCTGGTGT TATATTGTTT GGCTTTTGTT CTGGGAATTC CAGGAAATGC CATCGTCATT 180 

TGGTTCACGG GGCTCAAGTG GAAGAAGACA GTCACCACTC TGTGGTTCCT CAATCTAGCC 240 

ATTGCGGATT TCATTTTTCT TCTCTTTCTG CCCCTGTACA TCTCCTATGT GGCCATGAAT 300 
TTCCACTGGC CCTTTGGCAT CTGGCTGTGC AAAGCCAATT CCTTCACTGC CCAGTTGAAC 360 

ATGTTTGCCA GTGTTTTTTT CCTGACAGTG ATCAGCCTGG ACCACTATAT CCACTTGATC 420 

CATCCTGTCT TATCTCATCG GCATCGAACC CTCAAGAACT CTCTGATTGT CATTATATTC 480 

ATCTGGCTTT TGGCTTCTCT AATTGGCGGT CCTGCCCTGT ACTTCCGGGA CACTGTGGAG 540 

TTCAATAATC ATACTCTTTG CTATAACAAT TTTCAGAAGC ATGATCCTGA CCTCACTTTG 600 

ATCAGGCACC ATGTTCTGAC TTGGGTGAAA TTTATCATTG GCTATCTCTT CCCTTTGCTA 660 

ACAATGAGTA TTTGCTACTT GTGTCTCATC TTCAAGGTGA AGAAGCGAAC AGTCCTGATC 720 

TCCAGTAGGC ATAAGTGGAC AATTCTGGTT GTGGTTGTGG CCTTTGTGGT TTGCTGGACT 780 

CCTTATCACC TGTTTAGCAT TTGGGAGCTC ACCATTCACC ACAATAGCTA TTCCCACCAT 840 

GTGATGCAGG CTGGAATCCC CCTCTCCACT GGTTTGGCAT TCCTCAATAG TTGCTTGAAC 900 

CCCATCCTTT ATGTCCTAAT TAGTAAGAAG TTCCAAGCTC GCTTCCGGTC CTCAGTTGCT 960 

GAGATACTCA AGTACACACT GTGGGAAGTC AGCTGTTCTG GCACAGTGAG TGAACAGCTC 1020 

AGGAACTCAG AAACCAAGAA TCTGTGTCTC CTGGAAACAG CTCAATAA 1068 
(165) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

Met Glu Asp Leu Glu Glu Thr Leu Phe Glu Glu Phe Glu Asn Tyr Ser 
1 5 10 15 

Tyr Asp Leu Asp Tyr Tyr Ser Leu Glu Ser Asp Leu Glu Glu Lys Val 
20 25 30 

Gln Leu Gly Val Val His Trp Val Ser Leu Val Leu Tyr Cys Leu Ala 
35 40 45 

Phe Val Leu Gly Ile Pro Gly Asn Ala Ile Val Ile Trp Phe Thr Gly 
50 55 60 

Leu Lys Trp Lys Lys Thr Val Thr Thr Leu Trp Phe Leu Asn Leu Ala 
65 70 75 80 

Ile Ala Asp Phe Ile Phe Leu Leu Phe Leu Pro Leu Tyr Ile Ser Tyr 
85 90 95 

Val Ala Met Asn Phe His Trp Pro Phe Gly Ile Trp Leu Cys Lys Ala 
100 105 110 

Asn Ser Phe Thr Ala Gln Leu Asn Met Phe Ala Ser Val Phe Phe Leu 
115 120 125 

Thr Val Ile Ser Leu Asp His Tyr Ile His Leu Ile His Pro Val Leu 
130 135 140 

Ser His Arg His Arg Thr Leu Lys Asn Ser Leu Ile Val Ile Ile Phe 
145 150 155 160 

Ile Trp Leu Leu Ala Ser Leu Ile Gly Gly Pro Ala Leu Tyr Phe Arg 
165 170 175 

Asp Thr Val Glu Phe Asn Asn His Thr Leu Cys Tyr Asn Asn Phe Gln 
180 185 190 

Lys His Asp Pro Asp Leu Thr Leu Ile Arg His His Val Leu Thr Trp 
195 200 205 

Val Lys Phe Ile Ile Gly Tyr Leu Phe Pro Leu Leu Thr Met Ser Ile 
210 215 220 

Cys Tyr Leu Cys Leu Ile Phe Lys Val Lys Lys Arg Thr Val Leu Ile 
225 230 235 240 

Ser Ser Arg His Lys Trp Thr Ile Leu Val Val Val Val Ala Phe Val 
245 250 255 

Val Cys Trp Thr Pro Tyr His Leu Phe Ser Ile Trp Glu Leu Thr Ile 
260 265 270 

His His Asn Ser Tyr Ser His His Val Met Gln Ala Gly Ile Pro Leu 
275 280 285 

Ser Thr Gly Leu Ala Phe Leu Asn Ser Cys Leu Asn Pro Ile Leu Tyr 
290 295 300 

Val Leu Ile Ser Lys Lys Phe Gln Ala Arg Phe Arg Ser Ser Val Ala 
305 310 315 320 

Glu Ile Leu Lys Tyr Thr Leu Trp Glu Val Ser Cys Ser Gly Thr Val 
325 330 335 

Ser Glu Gln Leu Arg Asn Ser Glu Thr Lys Asn Leu Cys Leu Leu Glu 
340 345 350 

Thr Ala Gln 
355 

(166) INFORMATION FOR SEQ ID NO: 165: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1089 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

ATGGGCAACC ACACGTGGGA GGGCTGCCAC GTGGACTCGC GCGTGGACCA CCTCTTTCCG 60 

CCATCCCTCT ACATCTTTGT CATCGGCGTG GGGCTGCCCA CCAACTGCCT GGCTCTGTGG 120 

GCGGCCTACC GCCAGGTGCA ACAGCGCAAC GAGCTGGGCG TCTACCTGAT GAACCTCAGC 180 

ATCGCCGACC TGCTGTACAT CTGCACGCTG CCGCTGTGGG TGGACTACTT CCTGCACCAC 240 

GACAACTGGA TCCACGGCCC CGGGTCCTGC AAGCTCTTTG GGTTCATCTT CTACACCAAT 300 

ATCTACATCA GCATCGCCTT CCTGTGCTGC ATCTCGGTGG ACCGCTACCT GGCTGTGGCC 360 

CACCCACTCC GCTTCGCCCG CCTGCGCCGC GTCAAGACCG CCGTGGCCGT GAGCTCGGTG 420 

GTCTGGGCCA CGGAGCTGGG CGCCAACTCG GCGCCCCTGT TCCATGACGA GCTCTTCCGA 480 

GACCGCTACA ACCACACCTT CTGCTTTGAG AAGTTCCCCA TGGAAGGCTG GGTGGCCTGG 540 

ATGAACCTCT ATCGGGTGTT CGTGGGCTTC CTCTTCCCGT GGGCGCTCAT GCTGCTGTCG 600 

TACCGGGGCA TCCTGCGGGC CGTGCGGGGC AGCGTGTCCA CCGAGCGCCA GGAGAAGGCC 660 

AAGATCGCGC GGCTGGCCCT CAGCCTCATC GCCATCGTGC TGGTCTGCTT TGCGCCCTAT 720 

CACGTGCTCT TGCTGTCCCG CAGCGCCATC TACCTGGGCC GCCCCTGGGA CTGCGGCTTC 780 

GAGGAGCGCG TCTTTTCTGC ATACCACAGC TCACTGGCTT TCACCAGCCT CAACTGTGTG 840 

GCGGACCCCA TCCTCTACTG CCTGGTCAAC GAGGGCGCCC GCAGCGATGT GGCCAAGGCC 900 

CTGCACAACC TGCTCCGCTT TCTGGCCAGC GACAAGCCCC AGGAGATGGC CAATGCCTCG 960 

CTCACCCTGG AGACCCCACT CACCTCCAAG AGGAACAGCA CAGCCAAAGC CATGACTGGC 1020 

AGCTGGGCGG CCACTCCGCC TTCCCAGGGG GACCAGGTGC AGCTGAAGAT GCTGCCGCCA 1080 

GCACAATGA 1089 
(167) INFORMATION FOR SEQ ID NO:166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

Met Gly Asn His Thr Trp Glu Gly Cys His Val Asp Ser Arg Val Asp 
1 5 10 15 

His Leu Phe Pro Pro Ser Leu Tyr Ile Phe Val Ile Gly Val Gly Leu 
20 25 30 

Pro Thr Asn Cys Leu Ala Leu Trp Ala Ala Tyr Arg Gln Val Gln Gln 
35 40 45 

Arg Asn Glu Leu Gly Val Tyr Leu Met Asn Leu Ser Ile Ala Asp Leu 
50 55 60 

Leu Tyr Ile Cys Thr Leu Pro Leu Trp Val Asp Tyr Phe Leu His His 
65 70 75 80 

Asp Asn Trp Ile His Gly Pro Gly Ser Cys Lys Leu Phe Gly Phe Ile 
85 90 95 

Phe Tyr Thr Asn Ile Tyr Ile Ser Ile Ala Phe Leu Cys Cys Ile Ser 
100 105 110 

Val Asp Arg Tyr Leu Ala Val Ala His Pro Leu Arg Phe Ala Arg Leu 
115 120 125 

Arg Arg Val Lys Thr Ala Val Ala Val Ser Ser Val Val Trp Ala Thr 
130 135 140 

Glu Leu Gly Ala Asn Ser Ala Pro Leu Phe His Asp Glu Leu Phe Arg 
145 150 155 160 

Asp Arg Tyr Asn His Thr Phe Cys Phe Glu Lys Phe Pro Met Glu Gly 
165 170 175 

Trp Val Ala Trp Met Asn Leu Tyr Arg Val Phe Val Gly Phe Leu Phe 
180 185 190 

Pro Trp Ala Leu Met Leu Leu Ser Tyr Arg Gly Ile Leu Arg Ala Val 
195 200 205 

Arg Gly Ser Val Ser Thr Glu Arg Gln Glu Lys Ala Lys Ile Ala Arg 
210 215 220 

Leu Ala Leu Ser Leu Ile Ala Ile Val Leu Val Cys Phe Ala Pro Tyr 
225 230 235 240 

His Val Leu Leu Leu Ser Arg Ser Ala Ile Tyr Leu Gly Arg Pro Trp 
245 250 255 

Asp Cys Gly Phe Glu Glu Arg Val Phe Ser Ala Tyr His Ser Ser Leu 
260 265 270 

Ala Phe Thr Ser Leu Asn Cys Val Ala Asp Pro Ile Leu Tyr Cys Leu 
275 280 285 

Val Asn Glu Gly Ala Arg Ser Asp Val Ala Lys Ala Leu His Asn Leu 
290 295 300 

Leu Arg Phe Leu Ala Ser Asp Lys Pro Gln Glu Met Ala Asn Ala Ser 
305 310 315 320 

Leu Thr Leu Glu Thr Pro Leu Thr Ser Lys Arg Asn Ser Thr Ala Lys 
325 330 335 

Ala Met Thr Gly Ser Trp Ala Ala Thr Pro Pro Ser Gln Gly Asp Gln 
340 345 350 

Val Gln Leu Lys Met Leu Pro Pro Ala Gln 
355 360 

(168) INFORMATION FOR SEQ ID NO:167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

ATGGAGTCCT CAGGCAACCC AGAGAGCACC ACCTTTTTTT ACTATGACCT TCAGAGCCAG 60 

CCGTGTGAGA ACCAGGCCTG GGTCTTTGCT ACCCTCGCCA CCACTGTCCT GTACTGCCTG 120 

GTGTTTCTCC TCAGCCTAGT GGGCAACAGC CTGGTCCTGT GGGTCCTGGT GAAGTATGAG 180 

AGCCTGGAGT CCCTCACCAA CATCTTCATC CTCAACCTGT GCCTCTCAGA CCTGGTGTTC 240 

GCCTGCTTGT TGCCTGTGTG GATCTCCCCA TACCACTGGG GCTGGGTGCT GGGAGACTTC 300 

CTCTGCAAAC TCCTCAATAT GATCTTCTCC ATCAGCCTCT ACAGCAGCAT CTTCTTCCTG 360 

ACCATCATGA CCATCCACCG CTACCTGTCG GTAGTGAGCC CCCTCTCCAC CCTGCGCGTC 420 

CCCACCCTCC GCTGCCGGGT GCTGGTGACC ATGGCTGTGT GGGTAGCCAG CATCCTGTCC 480 

TCCATCCTCG ACACCATCTT CCACAAGGTG CTTTCTTCGG GCTGTGATTA TTCCGAACTC 540 

ACGTGGTACC TCACCTCCGT CTACCAGCAC AACCTCTTCT TCCTGCTGTC CCTGGGGATT 600 

ATCCTGTTCT GCTACGTGGA GATCCTCAGG ACCCTGTTCC GCTCACGCTC CAAGCGGCGC 660 

CACCGCACGA AAAAGCTCAT CTTCGCCATC GTGGTGGCCT ACTTCCTCAG CTGGGGTCCC 720 

TACAACTTCA CCCTGTTTCT GCAGACGCTG TTTCGGACCC AGATCATCCG GAGCTGCGAG 780 
GCCAAACAGC AGCTAGAATA CGCCCTGCTC ATCTGCCGCA ACCTCGCCTT CTCCCACTGC 840 

TGCTTTAACC CGGTGCTCTA TGTCTTCGTG GGGGTCAAGT TCCGCACACA CCTGAAACAT 900 

GTTCTCCGGC AGTTCTGGTT CTGCCGGCTG CAGGCACCCA GCCCAGCCTC GATCCCCCAC 960 

TCCCCTGGTG CCTTCGCCTA TGAGGGCGCC TCCTTCTACT GA 1002 
(169) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

Met Glu Ser Ser Gly Asn Pro Glu Ser Thr Thr Phe Phe Tyr Tyr Asp 
1 5 10 15 

Leu Gln Ser Gln Pro Cys Glu Asn Gln Ala Trp Val Phe Ala Thr Leu 
20 25 30 

Ala Thr Thr Val Leu Tyr Cys Leu Val Phe Leu Leu Ser Leu Val Gly 
35 40 45 

Asn Ser Leu Val Leu Trp Val Leu Val Lys Tyr Glu Ser Leu Glu Ser 
50 55 60 

Leu Thr Asn Ile Phe Ile Leu Asn Leu Cys Leu Ser Asp Leu Val Phe 
65 70 75 80 

Ala Cys Leu Leu Pro Val Trp Ile Ser Pro Tyr His Trp Gly Trp Val 
85 90 95 

Leu Gly Asp Phe Leu Cys Lys Leu Leu Asn Met Ile Phe Ser Ile Ser 
100 105 110 

Leu Tyr Ser Ser Ile Phe Phe Leu Thr Ile Met Thr Ile His Arg Tyr 
115 120 125 

Leu Ser Val Val Ser Pro Leu Ser Thr Leu Arg Val Pro Thr Leu Arg 
130 135 140 

Cys Arg Val Leu Val Thr Met Ala Val Trp Val Ala Ser Ile Leu Ser 
145 150 155 160 

Ser Ile Leu Asp Thr Ile Phe His Lys Val Leu Ser Ser Gly Cys Asp 
165 170 175 

Tyr Ser Glu Leu Thr Trp Tyr Leu Thr Ser Val Tyr Gln His Asn Leu 
180 185 190 



WO 00/22129 



PCT/US99/23938 



118 



Phe Phe Leu Leu Ser Leu Gly Ile Ile Leu Phe Cys Tyr Val Glu Ile 
195 200 205 

Leu Arg Thr Leu Phe Arg Ser Arg Ser Lys Arg Arg His Arg Thr Lys 
210 215 220 

Lys Leu Ile Phe Ala Ile Val Val Ala Tyr Phe Leu Ser Trp Gly Pro 
225 230 235 240 

Tyr Asn Phe Thr Leu Phe Leu Gln Thr Leu Phe Arg Thr Gln Ile Ile 
245 250 255 

Arg Ser Cys Glu Ala Lys Gln Gln Leu Glu Tyr Ala Leu Leu Ile Cys 
260 265 270 

Arg Asn Leu Ala Phe Ser His Cys Cys Phe Asn Pro Val Leu Tyr Val 
275 280 285 

Phe Val Gly Val Lys Phe Arg Thr His Leu Lys His Val Leu Arg Gln 
290 295 300 

Phe Trp Phe Cys Arg Leu Gln Ala Pro Ser Pro Ala Ser Ile Pro His 
305 310 315 320 

Ser Pro Gly Ala Phe Ala Tyr Glu Gly Ala Ser Phe Tyr 
325 330 



(170) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 987 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:169: 

ATGGACAACG CCTCGTTCTC GGAGCCCTGG CCCGCCAACG CATCGGGCCC GGACCCGGCG 60 

CTGAGCTGCT CCAACGCGTC GACTCTGGCG CCGCTGCCGG CGCCGCTGGC GGTGGCTGTA 120 

CCAGTTGTCT ACGCGGTGAT CTGCGCCGTG GGTCTGGCGG GCAACTCCGC CGTGCTGTAC 180 

GTGTTGCTGC GGGCGCCCCG CATGAAGACC GTCACCAACC TGTTCATCCT CAACCTGGCC 240 

ATCGCCGACG AGCTCTTCAC GCTGGTGCTG CCCATCAACA TCGCCGACTT CCTGCTGCGG 300 

CAGTGGCCCT TCGGGGAGCT CATGTGCAAG CTCATCGTGG CTATCGACCA GTACAACACC 360 

TTCTCCAGCC TCTACTTCCT CACCGTCATG AGCGCCGACC GCTACCTGGT GGTGTTGGCC 420 

ACTGCGGAGT CGCGCCGGGT GGCCGGCCGC ACCTACAGCG CCGCGCGCGC GGTGAGCCTG 480 
GCCGTGTGGG GGATCGTCAC ACTCGTCGTG CTGCCCTTCG CAGTCTTCGC CCGGCTAGAC 540 

GACGAGCAGG GCCGGCGCCA GTGCGTGCTA GTCTTTCCGC AGCCCGAGGC CTTCTGGTGG 600 

CGCGCGAGCC GCCTCTACAC GCTCGTGCTG GGCTTCGCCA TCCCCGTGTC CACCATCTGT 660 

GTCCTCTATA CCACCCTGCT GTGCCGGCTG CATGCCATGC GGCTGGACAG CCACGCCAAG 720 

GCCCTGGAGC GCGCCAAGAA GCGGGTGAAG TTCCTGGTGG TGGCAATCCT GGCGGTGTGC 780 

CTCCTCTGCT GGACGCCCTA CCACCTGAGC ACCGTGGTGG CGCTCACCAC CGACCTCCCG 840 

CAGACGCCGC TGGTCATCGC TATCTCCTAC TTCATCACCA GCCTGACGTA CGCCAACAGC 900 

TGCCTCAACC CCTTCCTCTA CGCCTTCCTG GACGCCAGCT TCCGCAGGAA CCTCCGCCAG 960 

CTGATAACTT GCCGCGCGGC AGCCTGA 987 
(171) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 328 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Met Asp Asn Ala Ser Phe Ser Glu Pro Trp Pro Ala Asn Ala Ser Gly 
1 5 10 15 

Pro Asp Pro Ala Leu Ser Cys Ser Asn Ala Ser Thr Leu Ala Pro Leu 
20 25 30 

Pro Ala Pro Leu Ala Val Ala Val Pro Val Val Tyr Ala Val Ile Cys 
35 40 45 

Ala Val Gly Leu Ala Gly Asn Ser Ala Val Leu Tyr Val Leu Leu Arg 
50 55 60 

Ala Pro Arg Met Lys Thr Val Thr Asn Leu Phe Ile Leu Asn Leu Ala 
65 70 75 80 

Ile Ala Asp Glu Leu Phe Thr Leu Val Leu Pro Ile Asn Ile Ala Asp 
85 90 95 

Phe Leu Leu Arg Gln Trp Pro Phe Gly Glu Leu Met Cys Lys Leu Ile 
100 105 110 

Val Ala Ile Asp Gln Tyr Asn Thr Phe Ser Ser Leu Tyr Phe Leu Thr 
115 120 125 

Val Met Ser Ala Asp Arg Tyr Leu Val Val Leu Ala Thr Ala Glu Ser 
130 135 140 

Arg Arg Val Ala Gly Arg Thr Tyr Ser Ala Ala Arg Ala Val Ser Leu 
145 150 155 160 

Ala Val Trp Gly Ile Val Thr Leu Val Val Leu Pro Phe Ala Val Phe 
165 170 175 

Ala Arg Leu Asp Asp Glu Gln Gly Arg Arg Gln Cys Val Leu Val Phe 
180 185 190 

Pro Gln Pro Glu Ala Phe Trp Trp Arg Ala Ser Arg Leu Tyr Thr Leu 
195 200 205 

Val Leu Gly Phe Ala Ile Pro Val Ser Thr Ile Cys Val Leu Tyr Thr 
210 215 220 

Thr Leu Leu Cys Arg Leu His Ala Met Arg Leu Asp Ser His Ala Lys 
225 230 235 240 

Ala Leu Glu Arg Ala Lys Lys Arg Val Lys Phe Leu Val Val Ala Ile 
245 250 255 

Leu Ala Val Cys Leu Leu Cys Trp Thr Pro Tyr His Leu Ser Thr Val 
260 265 270 

Val Ala Leu Thr Thr Asp Leu Pro Gln Thr Pro Leu Val Ile Ala Ile 
275 280 285 

Ser Tyr Phe Ile Thr Ser Leu Thr Tyr Ala Asn Ser Cys Leu Asn Pro 
290 295 300 

Phe Leu Tyr Ala Phe Leu Asp Ala Ser Phe Arg Arg Asn Leu Arg Gln 
305 310 315 320 

Leu Ile Thr Cys Arg Ala Ala Ala 
325 

(172) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1002 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

ATGCAGGCCG CTGGGCACCC AGAGCCCCTT GACAGCAGGG GCTCCTTCTC CCTCCCCACG 60 

ATGGGTGCCA ACGTCTCTCA GGACAATGGC ACTGGCCACA ATGCCACCTT CTCCGAGCCA 120 

CTGCCGTTCC TCTATGTGCT CCTGCCCGCC GTGTACTCCG GGATCTGTGC TGTGGGGCTG 180 

ACTGGCAACA CGGCCGTCAT CC^TAAI^ CTAAGGGCGC CCAAGATGAA GACGGTCACC 240 

AACG^TTCA TCCTGAACCT GGCCGTCGCC GACGGGCTCT TCACGCTGGT ACTGCCTGTC 300 

AACAT^GCGG AGCACCTGCT GCAGTACTGG CCCTTCGGGG AGC^CTCTG CAAGCl^GTG 360 

CTGGCCGTCG ACCACTACAA CATCTTCTCC AGCATCTACT TCCTAGCCGT GA^CGTG 420 

GACCGATACC TGGTGGTGCT GGCCACCGO^ AGGTCCCGCC ACATGCCCTG GCGCACCTAC 480 

CGGGGGGCGA AGGTCGCCAG CCTCTGTGTC TCGCTGGGCG TCACGGTCCT GGTTCTGCCC 540 

TTCTTCTCTT TCGCTGGCGT CTACAGCAAC GAGCTGCAGG TCCCAAGCTG TGGGCTGAGC 600 

TTCCCGTGGC CCGAGCAGGT CTGGTTCAAG GCCAGCCGTG TCTACACGTT GGTCCTGGGC 660 

TTCGTGCTGC CCGTGTGCAC CATCTGTGTO CTCTACACAG ACCTCCTGCG CAGGCTGCGG 720 

GCCGTGCGGC TCCGCTCTGG AGCCAAGGCT CTAGGCAAGG CCAGGCGGAA GGTGAAAGTC 780 

CTGGTCCTCG TCGTGCTGGC CGTGTGCCTC CTCTGCTGGA CGCCCTTCCA CCTGGCCTCT 840 

GTCGTGGCCC TGACCACGGA CCTGCCCCAG ACCCCAC^ TCATCAGTAT GTCCTACGTC 900 

ATCACCAGCC TCACGTACGC CAACTCGTGC CI^AACCCCT TCCTCTACGC CTTTCTAGAT 960 
GACAACTTCC GGAAGAACTT CCGCAGCATA TTCCGGTGCT GA 1002 
(173) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 333 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172: 

Met Gln Ala Ala Gly His Pro Glu Pro Leu Asp Ser Arg Gly Ser Phe 
1 5 10 15 

Ser Leu Pro Thr Met Gly Ala Asn Val Ser Gln Asp Asn Gly Thr Gly 
20 25 30 

His Asn Ala Thr Phe Ser Glu Pro Leu Pro Phe Leu Tyr Val Leu Leu 
35 40 45 

Pro Ala Val Tyr Ser Gly Ile Cys Ala Val Gly Leu Thr Gly Asn Thr 
50 55 60 

Ala Val Ile Leu Val Ile Leu Arg Ala Pro Lys Met Lys Thr Val Thr 
65 70 75 80 

Asn Val Phe Ile Leu Asn Leu Ala Val Ala Asp Gly Leu Phe Thr Leu 
85 90 95 

Val Leu Pro Val Asn Ile Ala Glu His Leu Leu Gln Tyr Trp Pro Phe 
100 105 110 

Gly Glu Leu Leu Cys Lys Leu Val Leu Ala Val Asp His Tyr Asn Ile 
115 120 125 

Phe Ser Ser Ile Tyr Phe Leu Ala Val Met Ser Val Asp Arg Tyr Leu 
130 135 140 

Val Val Leu Ala Thr Val Arg Ser Arg His Met Pro Trp Arg Thr Tyr 
145 150 155 160 

Arg Gly Ala Lys Val Ala Ser Leu Cys Val Trp Leu Gly Val Thr Val 
165 170 175 

Leu Val Leu Pro Phe Phe Ser Phe Ala Gly Val Tyr Ser Asn Glu Leu 
180 185 190 

Gln Val Pro Ser Cys Gly Leu Ser Phe Pro Trp Pro Glu Gln Val Trp 
195 200 205 

Phe Lys Ala Ser Arg Val Tyr Thr Leu Val Leu Gly Phe Val Leu Pro 
210 215 220 

Val Cys Thr Ile Cys Val Leu Tyr Thr Asp Leu Leu Arg Arg Leu Arg 
225 230 235 240 

Ala Val Arg Leu Arg Ser Gly Ala Lys Ala Leu Gly Lys Ala Arg Arg 
245 250 255 

Lys Val Lys Val Leu Val Leu Val Val Leu Ala Val Cys Leu Leu Cys 
260 265 270 

Trp Thr Pro Phe His Leu Ala Ser Val Val Ala Leu Thr Thr Asp Leu 
275 280 285 

Pro Gln Thr Pro Leu Val Ile Ser Met Ser Tyr Val Ile Thr Ser Leu 
290 295 300 

Thr Tyr Ala Asn Ser Cys Leu Asn Pro Phe Leu Tyr Ala Phe Leu Asp 
305 310 315 320 

Asp Asn Phe Arg Lys Asn Phe Arg Ser Ile Leu Arg Cys 
325 330 

(174) INFORMATION FOR SEQ ID NO:173: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:173: 

ATGGTCCTTG AGGTGAGTGA CCACCAAGO^ CTAAAO^CG CCGAGGTTGC CGCCCTCCT^ 60 

GAGAACTTCA GCTCTTCCTA TGACTATGGA GAAAACGAGA GTGACTCGTG CTGTACCTCC 120 

CCGCCCTGCC CACAGGACTT CAGCCTGAAC TTCGACCGGG CCTTCCTGCC AGCCCTCTAC 180 

AGCCTCCTCT TTCTGCTGGG GCTGCTGGGC AACGGCGCGG ^CAGCCGT GCTGCTGAGC 240 

CGGCGGACAG CCCTGAGCAG CACCGACACC TTCCTGCTCC ACCTAGCTGT AGCAGACACG 300 

CTGCTGGTGC TGACACTGCC GCTCTGGGCA GTGGACGCTG CCGTCCAGTG GGTCTTTGGC 360 

TCTGGCCTCT GCAAAGTGGC AGGTGCCCTC TTCAACATCA ACTTCTACGC AGGAGCCCTC 420 

CTGCTGGCCT GCATCAGCTT TGACCGCTAC CTGAACATAG TTCATGCCAC CCAGCTCTAC 480 

CGCCGGGGGC CCCCGGCCCG CGTGACCCTC ACCTGCGTGG CTGTCTGGGG GCTCTGCCTG 540 

CTTTTCGCCC TCCCAGACTT CATCTTCCTG TCGGCCCACC ACGACGAGCG CCTCAACGCC 600 

ACCCACTGCC AATACAACTT CCCACAGGTG GGCCGCACGG CTCTGCGGGT GCTGCAGC1X3 660 

GTGGC^CT TTCTGCTGCC CCTGCTGGTC ATGGCCTACT GCTATGCCCA CATCCTGGCC 720 

GTGCTGCTGG TTTCCAGGGG CCAGCGGCGC CTGCGGGCCA AGCGGCTGGT GGTGGTGGTC 780 

GTGGTGGCCT TTGCCCTCTG CTGGACCCCC TATCACCTGG TGGTGCTGGT GGACATCCTC 840 

ATGGACCTGG GCGC.TTGGC CCGCAACTGT GGCCGAGAAA GCAGGGTAGA CGTGGCCAAG 900 

TCGGTCACCT CAGGCCTGGG CTACATGCAC TGC^CCTCA ACCCGCTGCT CTATGCCTTT 960 

GTAGGGGTCA AGTTCCGGGA GCGGATGTGG ATGCTGCTCT TGCGCCTGGG CTGCCCCAAC 1020 

CAGAGAGGGC TCCAGAGGCA GCCATCGTCT TCCCGCCGGG ATTCATCCTG GTCTGAGACC 1080 

TCAGAGGCCT CCTACTCGGG CTTGTGA 1107 

(175) INFORMATION FOR SEQ ID NO:174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

Met Val Leu Glu Val Ser Asp His Gln Val Leu Asn Asp Ala Glu Val 
1 5 10 15 

Ala Ala Leu Leu Glu Asn Phe Ser Ser Ser Tyr Asp Tyr Gly Glu Asn 
20 25 30 

Glu Ser Asp Ser Cys Cys Thr Ser Pro Pro Cys Pro Gln Asp Phe Ser 
35 40 45 

Leu Asn Phe Asp Arg Ala Phe Leu Pro Ala Leu Tyr Ser Leu Leu Phe 
50 55 60 

Leu Leu Gly Leu Leu Gly Asn Gly Ala Val Ala Ala Val Leu Leu Ser 
65 70 75 80 

Arg Arg Thr Ala Leu Ser Ser Thr Asp Thr Phe Leu Leu His Leu Ala 
85 90 95 

Val Ala Asp Thr Leu Leu Val Leu Thr Leu Pro Leu Trp Ala Val Asp 
100 105 110 

Ala Ala Val Gln Trp Val Phe Gly Ser Gly Leu Cys Lys Val Ala Gly 
115 120 125 

Ala Leu Phe Asn Ile Asn Phe Tyr Ala Gly Ala Leu Leu Leu Ala Cys 
130 135 140 

Ile Ser Phe Asp Arg Tyr Leu Asn Ile Val His Ala Thr Gln Leu Tyr 
145 150 155 160 

Arg Arg Gly Pro Pro Ala Arg Val Thr Leu Thr Cys Leu Ala Val Trp 
165 170 175 

Gly Leu Cys Leu Leu Phe Ala Leu Pro Asp Phe Ile Phe Leu Ser Ala 
180 185 190 

His His Asp Glu Arg Leu Asn Ala Thr His Cys Gln Tyr Asn Phe Pro 
195 200 205 

Gln Val Gly Arg Thr Ala Leu Arg Val Leu Gln Leu Val Ala Gly Phe 
210 215 220 

Leu Leu Pro Leu Leu Val Met Ala Tyr Cys Tyr Ala His Ile Leu Ala 
225 230 235 240 

Val Leu Leu Val Ser Arg Gly Gln Arg Arg Leu Arg Ala Lys Arg Leu 
245 250 255 

Val Val Val Val Val Val Ala Phe Ala Leu Cys Trp Thr Pro Tyr His 
260 265 270 

Leu Val Val Leu Val Asp Ile Leu Met Asp Leu Gly Ala Leu Ala Arg 
275 280 285 

Asn Cys Gly Arg Glu Ser Arg Val Asp Val Ala Lys Ser Val Thr Ser 
290 295 300 

Gly Leu Gly Tyr Met His Cys Cys Leu Asn Pro Leu Leu Tyr Ala Phe 
305 310 315 320 

Val Gly Val Lys Phe Arg Glu Arg Met Trp Met Leu Leu Leu Arg Leu 
325 330 335 

Gly Cys Pro Asn Gln Arg Gly Leu Gln Arg Gln Pro Ser Ser Ser Arg 
340 345 350 

Arg Asp Ser Ser Trp Ser Glu Thr Ser Glu Ala Ser Tyr Ser Gly Leu 
355 360 365 

(176) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1074 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

ATGGCTGATG ACTATGGCTC TGAATCCACA TCTTCCATGG AAGACTACGT TAACTTCAAC 60 

TTCACTGACT TCTACTGTGA GAAAAACAAT GTCAGGCAGT TTGCGAGCCA TTTCCTCCCA 120 

CCCTTGTACT GGCTCGTGTT CATCGTGGGT GCCTTGGGCA ACAGTCTTGT TATCCTTGTC 180 

TACTGGTACT GCACAAGAGT GAAGACCATG ACCGACATGT TCCTTTTGAA TTTGGCAATT 240 

GCTGACCTCC TCTTTCTTGT CACTCTTCCC TTCTGGGCCA TTGCTGCTGC TGACCAGTGG 300 

AAGTTCCAGA CCTTCATGTG CAAGGTGGTC AACAGCATGT ACAAGATGAA CTTCTACAGC 360 

TGTGTGTTGC TGATCATGTG CATCAGCGTG GACAGGTACA TTGCCATTGC CCAGGCCATG 420 

AGAGCACATA CTTGGAGGGA GAAAAGGCTT TTGTACAGCA AAATGGTTTG CTTTACCATC 480 

TGGGTATTGG CAGCTGCTCT CTGCATCCCA GAAATCTTAT ACAGCCAAAT CAAGGAGGAA 540 

TCCGGCATTG CTATCTGCAC CATGGTTTAC CCTAGCGATG AGAGCACCAA ACTGAAGTCA 600 

GCTGTCTTGA CCCTGAAGGT CATTCTGGGG TTCTTCCTTC CCTTCGTGGT CATGGCTTGC 660 

TGCTATACCA TCATCATTCA CACCCTGATA CAAGCCAAGA AGTCTTCCAA GCACAAAGCC 720 

AAGAAAGTGA CCATCACTGT CCTGACCGTC TTTGTCTTGT CTCAGTTTCC CTACAACTGC 780 

ATTTTGTTGG TGCAGACCAT TGACGCCTAT GCCATGTTCA TCTCCAACTG TGCCGTTTCC 840 

ACCAACATTG ACATCTGCTT CCAGGTCACC CAGACCATCG CCTTCTTCCA CAGTTGCCTG 900 

AACCCTGTTC TCTATGTTTT TGTGGGTGAG AGATTCCGCC GGGATCTCGT GAAAACCCTG 960 

AAGAACTTGG GTTGCATCAG CCAGGCCCAG TGGGTTTCAT TTACAAGGAG AGAGGGAAGC 1020 

TTGAAGCTGT CGTCTATGTT GCTGGAGACA ACCTCAGGAG CACTCTCCCT CTGA 1074 
(177) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 357 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

Met Ala Asp Asp Tyr Gly Ser Glu Ser Thr Ser Ser Met Glu Asp Tyr 
1 5 10 15 

Val Asn Phe Asn Phe Thr Asp Phe Tyr Cys Glu Lys Asn Asn Val Arg 
20 25 30 

Gln Phe Ala Ser His Phe Leu Pro Pro Leu Tyr Trp Leu Val Phe Ile 
35 40 45 

Val Gly Ala Leu Gly Asn Ser Leu Val Ile Leu Val Tyr Trp Tyr Cys 
50 55 60 

Thr Arg Val Lys Thr Met Thr Asp Met Phe Leu Leu Asn Leu Ala Ile 
65 70 75 80 

Ala Asp Leu Leu Phe Leu Val Thr Leu Pro Phe Trp Ala Ile Ala Ala 
85 90 95 

Ala Asp Gln Trp Lys Phe Gln Thr Phe Met Cys Lys Val Val Asn Ser 
100 105 110 

Met Tyr Lys Met Asn Phe Tyr Ser Cys Val Leu Leu Ile Met Cys Ile 
115 120 125 

Ser Val Asp Arg Tyr Ile Ala Ile Ala Gln Ala Met Arg Ala His Thr 
130 135 140 

Trp Arg Glu Lys Arg Leu Leu Tyr Ser Lys Met Val Cys Phe Thr Ile 
145 150 155 160 

Trp Val Leu Ala Ala Ala Leu Cys Ile Pro Glu Ile Leu Tyr Ser Gln 
165 170 175 

Ile Lys Glu Glu Ser Gly Ile Ala Ile Cys Thr Met Val Tyr Pro Ser 
180 185 190 

Asp Glu Ser Thr Lys Leu Lys Ser Ala Val Leu Thr Leu Lys Val Ile 
195 200 205 

Leu Gly Phe Phe Leu Pro Phe Val Val Met Ala Cys Cys Tyr Thr Ile 
210 215 220 

Ile Ile His Thr Leu Ile Gln Ala Lys Lys Ser Ser Lys His Lys Ala 
225 230 235 240 

Lys Lys Val Thr Ile Thr Val Leu Thr Val Phe Val Leu Ser Gln Phe 
245 250 255 

Pro Tyr Asn Cys Ile Leu Leu Val Gln Thr Ile Asp Ala Tyr Ala Met 
260 265 270 

Phe Ile Ser Asn Cys Ala Val Ser Thr Asn Ile Asp Ile Cys Phe Gln 
275 280 285 

Val Thr Gln Thr Ile Ala Phe Phe His Ser Cys Leu Asn Pro Val Leu 
290 295 300 

Tyr Val Phe Val Gly Glu Arg Phe Arg Arg Asp Leu Val Lys Thr Leu 
305 310 315 320 

Lys Asn Leu Gly Cys Ile Ser Gln Ala Gln Trp Val Ser Phe Thr Arg 
325 330 335 

Arg Glu Gly Ser Leu Lys Leu Ser Ser Met Leu Leu Glu Thr Thr Ser 
340 345 350 

Gly Ala Leu Ser Leu 
355 

(178) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1110 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

ATGGCCTCAT CGACCACTCG GGGCCCCAGG GTTTCTGACT TATTTTCTGG GCTGCCGCCG 60 

GCGGTCACAA CTCCCGCCAA CCAGAGCGCA GAGGCCTCGG CGGGCAACGG GTCGGTSGCT 120 

GGCGCGGACG CTCCAGCCGT CACGCCCTTC CAGAGCCTGC AGCTGGTGCA TCAGC-TOAAG 180 

GGGCTGATCG TGCTGCTCTA CAGCGTCGTG GTGGTCGTGG GGCTGGTGGG CAACTGCCTG 240 

CTGGTGCTGG TGATCGCGCG GGTGCCGCGG CTGCACAACG TGACGAACTT CCTCATCGGC 300 

AACCTGGCCT TGTCCGACGT GCTCATGTGC ACCGCCTGCG TGCCGCTCAC GCTGGCCTAT 360 

GCCTTCGAGC CACGCGGCTG GGTGTTCGGC GGCGGCCTGT GCCACCTGGT CTTCTTCCTG 420 

CAGCCGGTCA CCGTCTATGT GTCGGTGTTC ACGCTCACCA CCATCGCAGT GGACCGCTAC 480 

GTCGTGCTGG TGCACCCGCT GAGGCGCGCA TCTCGCTGCG CCTCAGCCTA CGCTGTGCTG 540 

GCCATCTGGG CGCTGTCCGC GGTGCTGGCG CTGCCGCCCG CCGTGCACAC CTATCACGTG 600 

GAGCTCAAGC CGCACGACGT GCGCCTCTGC GAGGAGTTCT GGGGCTCCCA GGAGCGCCAG 660 

CGCCAGCTCT ACGCCTGGGG GCTGCTGCTG GTCACCTACC TGCTCCCTCT GCTGGTCATC 720 

CTCCTGTCTT ACGTCCGGGT GTCAGTGAAG CTCCGCAACC GCGTGGTGCC GGGCTGCGTG 780 

ACCCAGAGCC AGGCCGACTG GGACCGCGCT CGGCGCCGGC GCACCAAATG CTTGCTGGTG 840 

GTGGTCGTGG TGGTGTTCGC CGTCTGCTGG CTGCCGCTGC ACGTCTTCAA CCTGCTGCGG 900 

GACCTCGACC CCCACGCCAT CGACCCTTAC GCCTTTGGGC TGGTGCAGCT GCTCTGCCAC 960 

TGGCXCGCCA TGAGTTCGGC CTGCTACAAC CCCTTCATCT ACGCCTGGCT GCACGACAGC 1020 

TTCCGCGAGG AGCTGCGCAA ACTGTTGGTC GCTTGGCCCC GCAAGATAGC CCCCCATGGC 1080 

CAGAATATGA CCGTCAGCGT GGTCATCTGA 1110 
(179) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 369 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

Met Ala Ser Ser Thr Thr Arg Gly Pro Arg Val Ser Asp Leu Phe Ser 
1 5 10 15 

Gly Leu Pro Pro Ala Val Thr Thr Pro Ala Asn Gln Ser Ala Glu Ala 
20 25 30 

Ser Ala Gly Asn Gly Ser Val Ala Gly Ala Asp Ala Pro Ala Val Thr 
35 40 45 

Pro Phe Gln Ser Leu Gln Leu Val His Gln Leu Lys Gly Leu Ile Val 
50 55 60 

Leu Leu Tyr Ser Val Val Val Val Val Gly Leu Val Gly Asn Cys Leu 
65 70 75 80 

Leu Val Leu Val Ile Ala Arg Val Pro Arg Leu His Asn Val Thr Asn 
85 90 95 

Phe Leu Ile Gly Asn Leu Ala Leu Ser Asp Val Leu Met Cys Thr Ala 
100 105 110 

Cys Val Pro Leu Thr Leu Ala Tyr Ala Phe Glu Pro Arg Gly Trp Val 
115 120 125 

Phe Gly Gly Gly Leu Cys His Leu Val Phe Phe Leu Gln Pro Val Thr 
130 135 140 

Val Tyr Val Ser Val Phe Thr Leu Thr Thr Ile Ala Val Asp Arg Tyr 
145 150 155 160 

Val Val Leu Val His Pro Leu Arg Arg Ala Ser Arg Cys Ala Ser Ala 
165 170 175 

Tyr Ala Val Leu Ala Ile Trp Ala Leu Ser Ala Val Leu Ala Leu Pro 
180 185 190 

Pro Ala Val His Thr Tyr His Val Glu Leu Lys Pro His Asp Val Arg 
195 200 205 

Leu Cys Glu Glu Phe Trp Gly Ser Gln Glu Arg Gln Arg Gln Leu Tyr 
210 215 220 

Ala Trp Gly Leu Leu Leu Val Thr Tyr Leu Leu Pro Leu Leu Val Ile 
225 230 235 240 

Leu Leu Ser Tyr Val Arg Val Ser Val Lys Leu Arg Asn Arg Val Val 
245 250 255 

Pro Gly Cys Val Thr Gln Ser Gln Ala Asp Trp Asp Arg Ala Arg Arg 
260 265 270 

Arg Arg Thr Lys Cys Leu Leu Val Val Val Val Val Val Phe Ala Val 
275 280 285 

Cys Trp Leu Pro Leu His Val Phe Asn Leu Leu Arg Asp Leu Asp Pro 
290 295 300 

His Ala Ile Asp Pro Tyr Ala Phe Gly Leu Val Gln Leu Leu Cys His 
305 310 315 320 

Trp Leu Ala Met Ser Ser Ala Cys Tyr Asn Pro Phe Ile Tyr Ala Trp 
325 330 335 

Leu His Asp Ser Phe Arg Glu Glu Leu Arg Lys Leu Leu Val Ala Trp 
340 345 350 

Pro Arg Lys Ile Ala Pro His Gly Gln Asn Met Thr Val Ser Val Val 
355 360 365 

Ile 

(180) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1083 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

ATGGACCCAG AAGAAACTTC AGTTTATTTG GATTATTACT ATGCTACGAG CCCAAACTCT 60 

GACATCAGGG AGACCCACTC CCATGTTCCT TACACCTCTG TCTTCCTTCC AGTCTTTTAC 120 

ACAGCTGTGT TCCTGACTGG AGTGCTGGGG AACCTTGTTC TCATGGGAGC GTTGCATTTC 180 

AAACCCGGCA GCCGAAGACT GATCGACATC TTTATCATCA ATCTGGCTGC CTCTGACTTC 240 

ATTTTTCTTG TCACATTGCC TCTCTGGGTG GATAAAGAAG CATCTCTAGG ACTGTGGAGG 300 

ACGGGCTCCT TCCTGTGCAA AGGGAGCTCC TACATGATCT CCGTCAATAT GCACTGCAGT 360 

GTCCTCCTGC TCACTTGCAT GAGTGTTGAC CGCTACCTGG CCATTGTGTG GCCAGTCGTA 420 

TCCAGGAAAT TCAGAAGGAC AGACTGTGCA TATGTAGTCT GTGCCAGCAT CTGGTTTATC 480 

TCCTGCCTGC TGGGGTTGCC TACTCTTCTG TCCAGGGAGC TCACGCTGAT TGATGATAAG 540 

CCATACTGTG CAGAGAAAAA GGCAACTCCA ATTAAACTCA TATGGTCCCT GGTGGCCTTA 600 

ATTTTCACCT TTTTTGTCCC TTTGTTGAGC ATTGTGACCT GCTACTGTTG CATTGCAAGG 660 

AAGCTGTGTG CCCATTACCA GCAATCAGGA AAGCACAACA AAAAGCTGAA GAAATCTAAG 720 

AAGATCATCT TTATTGTCGT GGCAGCCTTT CTTGTCTCCT GGCTGCCCTT CAATACTTTC 780 

AAGTTCCTGG CCATTGTCTC TGGGTTGCGG CAAGAACACT ATTTACCCTC AGCTATTCTT 840 

CAGCTTGGTA TGGAGGTGAG TGGACCCTTG GCATTTGCCA ACAGCTGTGT CAACCCTTTC 900 

ATTTACTATA TCTTCGACAG CTACATCCGC CGGGCCATTG TCCACTGCTT GTGCCCTTGC 960 

CTGAAAAACT ATGACTTTGG GAGTAGCACT GAGACATCAG ATAGTCACCT CACTAAGGCT 1020 

CTCTCCACCT TCATTCATGC AGAAGATTTT GCCAGGAGGA GGAAGAGGTC TGTGTCACTC 1080 
TAA 1083 

(181) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 360 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 
Met Asp Pro Glu Glu Thr Ser Val Tyr Leu Asp Tyr Tyr Tyr Ala Thr 
1 5 10 15 

Ser Pro Asn Ser Asp Ile Arg Glu Thr His Ser His Val Pro Tyr Thr 
20 25 30 

Ser Val Phe Leu Pro Val Phe Tyr Thr Ala Val Phe Leu Thr Gly Val 
35 40 45 

Leu Gly Asn Leu Val Leu Met Gly Ala Leu His Phe Lys Pro Gly Ser 
50 55 60 

Arg Arg Leu Ile Asp Ile Phe Ile Ile Asn Leu Ala Ala Ser Asp Phe 
65 70 75 80 

Ile Phe Leu Val Thr Leu Pro Leu Trp Val Asp Lys Glu Ala Ser Leu 
85 90 95 

Gly Leu Trp Arg Thr Gly Ser Phe Leu Cys Lys Gly Ser Ser Tyr Met 
100 105 110 

Ile Ser Val Asn Met His Cys Ser Val Leu Leu Leu Thr Cys Met Ser 
115 120 125 

Val Asp Arg Tyr Leu Ala Ile Val Trp Pro Val Val Ser Arg Lys Phe 
130 135 140 

Arg Arg Thr Asp Cys Ala Tyr Val Val Cys Ala Ser Ile Trp Phe Ile 
145 150 155 160 

Ser Cys Leu Leu Gly Leu Pro Thr Leu Leu Ser Arg Glu Leu Thr Leu 
165 170 175 

Ile Asp Asp Lys Pro Tyr Cys Ala Glu Lys Lys Ala Thr Pro Ile Lys 
180 185 190 

Leu Ile Trp Ser Leu Val Ala Leu Ile Phe Thr Phe Phe Val Pro Leu 
195 200 205 

Leu Ser Ile Val Thr Cys Tyr Cys Cys Ile Ala Arg Lys Leu Cys Ala 
210 215 220 

His Tyr Gln Gln Ser Gly Lys His Asn Lys Lys Leu Lys Lys Ser Lys 
225 230 235 240 

Lys Ile Ile Phe Ile Val Val Ala Ala Phe Leu Val Ser Trp Leu Pro 
245 250 255 

Phe Asn Thr Phe Lys Phe Leu Ala Ile Val Ser Gly Leu Arg Gln Glu 
260 265 270 

His Tyr Leu Pro Ser Ala Ile Leu Gln Leu Gly Met Glu Val Ser Gly 
275 280 285 

Pro Leu Ala Phe Ala Asn Ser Cys Val Asn Pro Phe Ile Tyr Tyr Ile 
290 295 300 

Phe Asp Ser Tyr Ile Arg Arg Ala Ile Val His Cys Leu Cys Pro Cys 
305 310 315 320 

Leu Lys Asn Tyr Asp Phe Gly Ser Ser Thr Glu Thr Ser Asp Ser His 
325 330 335 

Leu Thr Lys Ala Leu Ser Thr Phe Ile His Ala Glu Asp Phe Ala Arg 
340 345 350 

Arg Arg Lys Arg Ser Val Ser Leu 
355 360 

(182) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1020 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

ATGAATGGCC TTGAAGTGGC TCCCCCAGGT CTGATCACCA ACTTCTCCCT GGCCACGGCA 60 

GAGCAATGTG GCCAGGAGAC GCCACTGGAG AACATGCTGT TCGCCTCCTT CTACCTTCTG 120 

GATTTTATCC TGGCTTTAGT TGGCAATACC CTGGCTCTGT GGCTTTTCAT CCGAGACCAC 180 

AAGTCCGGGA CCCCGGCCAA CGTGTTCCTG ATGCATCTGG CCGTGGCCGA CTTGTCGTGC 240 

GTGCTGGTCC TGCCCACCCG CCTGGTCTAC CACTTCTCTG GGAACCACTG GCCATTTGGG 300 

GAAATCGCAT GCCGTCTCAC CGGCTTCCTC TTCTACCTCA ACATGTACGC CAGCATCTAC 360 

TTCCTCACCT GCATCAGCGC CGACCGTTTC CTGGCCATTG TGCACCCGGT CAAGTCCCTC 420 

AAGCTCCGCA GGCCCCTCTA CGCACACCTG GCCTGTGCCT TCCTGTGGGT GGTGGTGGCT 480 

GTGGCCATGG CCCCGCTGCT GGTGAGCCCA CAGACCGTGC AGACCAACCA CACGGTGGTC 540 

TGCCTGCAGC TGTACCGGGA GAAGGCCTCC CACCATGCCC TGGTGTCCCT GGCAGTGGCC 600 

TTCACCTTCC CGTTCATCAC CACGGTCACC TGCTACCTGC TGATCATCCG CAGCCTGCGG 660 

CAGGGCCTGC GTGTGGAGAA GCGCCTCAAG ACCAAGGCAA AACGCATGAT CGCCATAGTG 720 

CTGGCCATCT TCCTGGTCTG CTTCGTGCCC TACCACGTCA ACCGCTCCGT CTACGTGCTG 780 

CACTACCGCA GCCATGGGGC CTCCTGCGCC ACCCAGCGCA TCCTGGCCCT GGCAAACCGC 840 

ATCACCTCCT GCCTCACCAG CCTCAACGGG GCACTCGACC CCATCATGTA TTTCTTCGTG 900 

GCTGAGAAGT TCCGCCACGC CCTGTGCAAC TTGCTCTGTG GCAAAAGGCT CAAGGGCCCG 960 

CCCCCCAGCT TCGAAGGGAA AACCAACGAG AGCTCGCTGA GTGCCAAGTC AGAGCTGTGA 1020 
(183) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Met Asn Gly Leu Glu Val Ala Pro Pro Gly Leu Ile Thr Asn Phe Ser 
1 5 10 15 

Leu Ala Thr Ala Glu Gln Cys Gly Gln Glu Thr Pro Leu Glu Asn Met 
20 25 30 

Leu Phe Ala Ser Phe Tyr Leu Leu Asp Phe Ile Leu Ala Leu Val Gly 
35 40 45 

Asn Thr Leu Ala Leu Trp Leu Phe Ile Arg Asp His Lys Ser Gly Thr 
50 55 60 

Pro Ala Asn Val Phe Leu Met His Leu Ala Val Ala Asp Leu Ser Cys 
65 70 75 80 

Val Leu Val Leu Pro Thr Arg Leu Val Tyr His Phe Ser Gly Asn His 
85 90 95 

Trp Pro Phe Gly Glu Ile Ala Cys Arg Leu Thr Gly Phe Leu Phe Tyr 
100 105 110 

Leu Asn Met Tyr Ala Ser Ile Tyr Phe Leu Thr Cys Ile Ser Ala Asp 
115 120 125 

Arg Phe Leu Ala Ile Val His Pro Val Lys Ser Leu Lys Leu Arg Arg 
130 135 140 

Pro Leu Tyr Ala His Leu Ala Cys Ala Phe Leu Trp Val Val Val Ala 
145 150 155 160 

Val Ala Met Ala Pro Leu Leu Val Ser Pro Gln Thr Val Gln Thr Asn 
165 170 175 

His Thr Val Val Cys Leu Gln Leu Tyr Arg Glu Lys Ala Ser His His 
180 185 190 

Ala Leu Val Ser Leu Ala Val Ala Phe Thr Phe Pro Phe Ile Thr Thr 
195 200 205 

Val Thr Cys Tyr Leu Leu Ile Ile Arg Ser Leu Arg Gln Gly Leu Arg 
210 215 220 

Val Glu Lys Arg Leu Lys Thr Lys Ala Lys Arg Met Ile Ala Ile Val 
225 230 235 240 

Leu Ala Ile Phe Leu Val Cys Phe Val Pro Tyr His Val Asn Arg Ser 
245 250 255 

Val Tyr Val Leu His Tyr Arg Ser His Gly Ala Ser Cys Ala Thr Gln 
260 265 270 

Arg Ile Leu Ala Leu Ala Asn Arg Ile Thr Ser Cys Leu Thr Ser Leu 
275 280 285 

Asn Gly Ala Leu Asp Pro Ile Met Tyr Phe Phe Val Ala Glu Lys Phe 
290 295 300 

Arg His Ala Leu Cys Asn Leu Leu Cys Gly Lys Arg Leu Lys Gly Pro 
305 310 315 320 

Pro Pro Ser Phe Glu Gly Lys Thr Asn Glu Ser Ser Leu Ser Ala Lys 
325 330 335 

Ser Glu Leu 



(184) INFORMATION FOR SEQ ID NO:183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

ATGATCACCC TGAACAATCA AGATCAACCT GTCCCTTTTA ACAGCTCACA TCCAGATGAA 60 

TACAAAATTG CAGCCCTTGT CTTCTATAGC TGTATCTTCA TAATTGGATT ATTTGTTAAC 120 

ATCACTGCAT TATGGGTTTT CAGTTGTACC ACCAAGAAGA GAACCACGGT AACCATCTAT 180 

ATGATGAATG TGGCATTAGT GGACTTGATA TTTATAATGA CTTTACCCTT TCGAATGTTT 240 

TATTATGCAA AAGATGAATG GCCATTTGGA GAGTACTTCT GCCAGATTCT TCGAGCTCTC 300 

ACAGTGTTTT ACCCAAGCAT TGCTTTATGG CTTCTTGCCT TTATTAGTGC TCACAGATAC 360 

ATGGCCATTG TACAGCCGAA GTACGCCAAA GAACTTAAAA ACACGTGCAA AGCCGTGCTG 420 

GCGTGTGTGG GAGTCTGGAT AATGACCCTG ACCACGACCA CCCCTCTGCT ACTGCTCTAT 480 

AAAGACCCAG ATAAAGACTC CACTCCCGCC ACCTGCCTCA AGATTTCTGA CATCATCTAT 540 

CTAAAAGCTG TGAACGTGCT GAACCTCACT CGACTCACAT TTTTTTTCTT GATTCCTTTG 600 

TTCATCATGA TTGGGTGCTA CTTGGTCATT ATTCATAATC TCCTTCACGG CAGGACGTCT 660 

AAGCTGAAAC CCAAAGTCAA GGAGAAGTCC AAAAGGATCA TCATCACGCT GCTGGTGCAG 720 

GTGCTCGTCT GCTTTATGCC CTTCCACATC TGTTTCGCTT TCCTGATGCT GGGAACGGGG 780 

GAGAATAGTT ACAATCCCTG GGGAGCCTTT ACCACCTTCC TCATGAACCT CAGCACGTGT 840 

CTGGATGTGA TTCTCTACTA CATCGTTTCA AAACAATTTC AGGCTCGAGT CATTAGTGTC 900 

ATGCTATACC GTAATTACCT TCGAAGCATC CGCAGAAAAA GTTTCCGATC TGGTAGTCTA 960 

AGGTCACTAA GCAATATAAA CAGTGAAATG TTATGA 996 

(185) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

Met Ile Thr Leu Asn Asn Gln Asp Gln Pro Val Pro Phe Asn Ser Ser 
1 5 10 15 

His Pro Asp Glu Tyr Lys Ile Ala Ala Leu Val Phe Tyr Ser Cys Ile 
20 25 30 

Phe Ile Ile Gly Leu Phe Val Asn Ile Thr Ala Leu Trp Val Phe Ser 
35 40 45 

Cys Thr Thr Lys Lys Arg Thr Thr Val Thr Ile Tyr Met Met Asn Val 
50 55 60 

Ala Leu Val Asp Leu Ile Phe Ile Met Thr Leu Pro Phe Arg Met Phe 
65 70 75 80 

Tyr Tyr Ala Lys Asp Glu Trp Pro Phe Gly Glu Tyr Phe Cys Gln Ile 
85 90 95 

Leu Gly Ala Leu Thr Val Phe Tyr Pro Ser Ile Ala Leu Trp Leu Leu 
100 105 110 

Ala Phe Ile Ser Ala Asp Arg Tyr Met Ala Ile Val Gln Pro Lys Tyr 
115 120 125 

Ala Lys Glu Leu Lys Asn Thr Cys Lys Ala Val Leu Ala Cys Val Gly 
130 135 140 

Val Trp Ile Met Thr Leu Thr Thr Thr Thr Pro Leu Leu Leu Leu Tyr 
145 150 155 160 

Lys Asp Pro Asp Lys Asp Ser Thr Pro Ala Thr Cys Leu Lys Ile Ser 
165 170 175 

Asp Ile Ile Tyr Leu Lys Ala Val Asn Val Leu Asn Leu Thr Arg Leu 
180 185 190 

Thr Phe Phe Phe Leu Ile Pro Leu Phe Ile Met Ile Gly Cys Tyr Leu 
195 200 205 

Val Ile Ile His Asn Leu Leu His Gly Arg Thr Ser Lys Leu Lys Pro 
210 215 220 

Lys Val Lys Glu Lys Ser Lys Arg Ile Ile Ile Thr Leu Leu Val Gln 
225 230 235 240 

Val Leu Val Cys Phe Met Pro Phe His Ile Cys Phe Ala Phe Leu Met 
245 250 255 

Leu Gly Thr Gly Glu Asn Ser Tyr Asn Pro Trp Gly Ala Phe Thr Thr 
260 265 270 

Phe Leu Met Asn Leu Ser Thr Cys Leu Asp Val Ile Leu Tyr Tyr Ile 
275 280 285 

Val Ser Lys Gln Phe Gln Ala Arg Val Ile Ser Val Met Leu Tyr Arg 
290 295 300 

Asn Tyr Leu Arg Ser Met Arg Arg Lys Ser Phe Arg Ser Gly Ser Leu 
305 310 315 320 

Arg Ser Leu Ser Asn Ile Asn Ser Glu Met Leu 
325 330 



(186) INFORMATION FOR SEQ ID NO:185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1077 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

ATGCCCTCTG TGTCTCCAGC GGGGCCCTCG GCCGGGGCAG TCCCCAATCC CACCGOVGTG 60 

ACAACAGTGC GGACCAATGC CAGCGGGCTG GAGGTGCCCC TGTTCCACCT GTTTGCCCGG 120 

CTGGACGAGG AGCTGCATGG CACCTTCCCA GGCCTGTGCG TGGCGCTGAT GGCGGTGCAC 180 

GGAGCCATCT TCCTGGCAGG GCTGGTGCTC AACGGGCTGG CGCTGTACGT CTTCTGCTGC 240 

CGCACCCGGG CCAAGACACC CTCAGTCATC TACACCATCA ACCTGGTGGT GACCGATCTA 300 

CTGGTAGGGC TGTCCCTGCC CACGCGCTTC GCTGTGTACT ACGGCGCCAG GGGCTGCCTG 360 

CGCTGTGCCT TCCCGCACGT CCTCGGTTAC TTCCTCAACA TGCACTGCTC CATCCTCTTC 420 

CTCACCTGCA TCTGCGTGGA CCGCTACCTG GCCATCGTCC GGCCCGAAGG CTCCCGCCGC 480 

TGCCGCCAGC CTGCCTGTGC CAGGGCCGTG TGCGCCTTCG TGTGGCTGGC CGCCGGTGCC 540 

GTCACCCTGT CGGTGCTGGG CGTGACAGGC AGCCGGCCCT GCTGCCGTGT CTTTGCGCTG 600 

ACTGTCCTGG AGTTCCTGCT GCCCCTGCTG GTCATCAGCG TGTTTACCGG CCGCATCATG 660 

TGTGCACTGT CGCGGCCGGG TCTGCTCCAC CAGGGTCGCC AGCGCCGCGT GCGGGCCAAG 720 

CAGCTCCTGC TCACGGTGCT CATCATCTTT CTCGTCTGCT TCACGCCCTT CCACGCCCGC 780 

CAAGTGGCCG TGGCGCTGTG GCCCGACATG CCACACCACA CGAGCCTCGT GGTCTACCAC 840 

GTGGCCGTGA CCCTCAGCAG CCTCAACAGC TGCATGGACC CCATCGTCTA CTGCTTCGTC 900 

ACCAGTGGCT TCCAGGCCAC CGTCCGAGGC CTCTTCGGCC AGCACGGAGA GCGTGAGCCC 960 

AGCAGCGGTG ACGTGGTCAG CATGCACAGG AGCTCCAAGG GCTCAGGCCG TCATCACATC 1020 

CTCAGTGCCG GCCCTCACGC CCTCACCCAG GCCCTGGCTA ATGGGCCCGA GGCTTAG 1077 
(187) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 358 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Met Pro Ser Val Ser Pro Ala Gly Pro Ser Ala Gly Ala Val Pro Asn 
1 5 10 15 

Ala Thr Ala Val Thr Thr Val Arg Thr Asn Ala Ser Gly Leu Glu Val 
20 25 30 

Pro Leu Phe His Leu Phe Ala Arg Leu Asp Glu Glu Leu His Gly Thr 
35 40 45 

Phe Pro Gly Leu Cys Val Ala Leu Met Ala Val His Gly Ala Ile Phe 
50 55 60 

Leu Ala Gly Leu Val Leu Asn Gly Leu Ala Leu Tyr Val Phe Cys Cys 
65 70 75 80 

Arg Thr Arg Ala Lys Thr Pro Ser Val Ile Tyr Thr Ile Asn Leu Val 
85 90 95 

Val Thr Asp Leu Leu Val Gly Leu Ser Leu Pro Thr Arg Phe Ala Val 
100 105 110 

Tyr Tyr Gly Ala Arg Gly Cys Leu Arg Cys Ala Phe Pro His Val Leu 
115 120 125 

Gly Tyr Phe Leu Asn Met His Cys Ser Ile Leu Phe Leu Thr Cys Ile 
130 135 140 

Cys Val Asp Arg Tyr Leu Ala Ile Val Arg Pro Glu Gly Ser Arg Ala 
145 150 155 160 

Cys Arg Gln Pro Ala Cys Ala Arg Ala Val Cys Ala Phe Val Trp Leu 
165 170 175 

Ala Ala Gly Ala Val Thr Leu Ser Val Leu Gly Val Thr Gly Ser Arg 
180 185 190 

Pro Cys Cys Arg Val Phe Ala Leu Thr Val Leu Glu Phe Leu Leu Pro 
195 200 205 

Leu Leu Val Ile Ser Val Phe Thr Gly Arg Ile Met Cys Ala Leu Ser 
210 215 220 

Arg Pro Gly Leu Leu His Gln Gly Arg Gln Arg Arg Val Arg Ala Lys 
225 230 235 240 

Gln Leu Leu Leu Thr Val Leu Ile Ile Phe Leu Val Cys Phe Thr Pro 
245 250 255 

Phe His Ala Arg Gln Val Ala Val Ala Leu Trp Pro Asp Met Pro His 
260 265 270 

His Thr Ser Leu Val Val Tyr His Val Ala Val Thr Leu Ser Ser Leu 
275 280 285 

Asn Ser Cys Met Asp Pro Ile Val Tyr Cys Phe Val Thr Ser Gly Phe 
290 295 300 

Gln Ala Thr Val Arg Gly Leu Phe Gly Gln His Gly Glu Arg Glu Pro 
305 310 315 320 

Ser Ser Gly Asp Val Val Ser Met His Arg Ser Ser Lys Gly Ser Gly 
325 330 335 

Arg His His Ile Leu Ser Ala Gly Pro His Ala Leu Thr Gln Ala Leu 
340 345 350 

Ala Asn Gly Pro Glu Ala 
355 

(188) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 
ATGAACTCCA CCTTGGATGG TAATCAGAGC AGCCACCCTT TTTCCCTCTT GGCATTTGGC 60 

TATTTGGAAA CTGTCAATTT TTGCCTTTTG GAAGTATTGA TTATTCTCTT TCTAACTGTA 120 

TTGATTATTT CTGGCAACAT CATTGTGATT TTTGTATTTC ACTGTGCACC TTTCTTGAAC 180 

CATCACACTA CAAGTTATTT TATCCAGACT ATGGCATATG CTGACCTTTT TGTTGGGGTG 240 

AGCTGCGTGG TCCCTTCTTT ATCACTCCTC CATCACCCCC TTCCAGTAGA GGAGTCCTTG 300 

ACTTGCCAGA TATTTGGTTT TGTAGTATCA GTTCTGAAGA GCGTCTCCAT GGCTTCTCTG 360 

GCCTGTATCA GCATTGATAG ATACATTGCC ATTACTAAAC CTTTAACCTA TAATACTCTG 420 

GTTACACCCT GGAGACTACG CCTGTGTATT TTCCTGATTT GGCTATACTC GACCCTGGTC 480 

TTCCTGCCTT CCTTTTTCCA CTGGGGCAAA CCTGGATATC ATCGAGATGT GTTTCAGTGG 540 

TGTGCGGAGT CCTGGCACAC CGACTCCTAC TTCACCCTGT TCATCGTGAT GATGTTATAT 600 

GCCCCAGCAG CCCTTATTGT CTGCTTCACC TATTTCAACA TCTTCCGCAT CTGCCAACAG 660 

CACACAAAGG ATATCAGCGA AAGGCAAGCC CGCTTCAGCA GCCAGAGTGG GGAGACTGGG 720 

GAAGTGCAGG CCTGTCCTGA TAAGCGCTAT AAAATGGTCC TGTTTCGAAT CACTAGTGTA 780 

TTTTACATCC TCTGGTTGCC ATATATCATC TACTTCTTGT TCGAAAGCTC CACTGGCCAC 840 

AGCAACCGCT TCGCATCCTT CTTGACCACC TGGCTTGCTA TTAGTAACAG TTTCTGCAAC 900 

TGTGTAATTT ATAGTCTCTC CAACAGTGTA TTCCAAAGAG GACTAAAGCG CCTCTCAGGG 960 

GCTATGTGTA CTTCTTGTCC AAGTCAGACT ACAGCCAACG ACCCTTACAC AGTTAGAAGC 1020 

AAAGGCCCTC TTAATGGATG TCATATCTGA 1050 

(189) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 349 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 



Met Asn Ser Thr Leu Asp Gly Asn Gln Ser Ser His Pro Phe Cys Leu 
1 5 10 15 

Leu Ala Phe Gly Tyr Leu Glu Thr Val Asn Phe Cys Leu Leu Glu Val 
20 25 30 

Leu Ile Ile Val Phe Leu Thr Val Leu Ile Ile Ser Gly Asn Ile Ile 
35 40 45 

Val Ile Phe Val Phe His Cys Ala Pro Leu Leu Asn His His Thr Thr 
50 55 60 

Ser Tyr Phe Ile Gln Thr Met Ala Tyr Ala Asp Leu Phe Val Gly Val 
65 70 75 80 

Ser Cys Val Val Pro Ser Leu Ser Leu Leu His His Pro Leu Pro Val 
85 90 95 

Glu Glu Ser Leu Thr Cys Gln Ile Phe Gly Phe Val Val Ser Val Leu 
100 105 110 

Lys Ser Val Ser Met Ala Ser Leu Ala Cys Ile Ser Ile Asp Arg Tyr 
115 120 125 

Ile Ala Ile Thr Lys Pro Leu Thr Tyr Asn Thr Leu Val Thr Pro Trp 
130 135 140 

Arg Leu Arg Leu Cys Ile Phe Leu Ile Trp Leu Tyr Ser Thr Leu Val 
145 150 155 160 

Phe Leu Pro Ser Phe Phe His Trp Gly Lys Pro Gly Tyr His Gly Asp 
165 170 175 

Val Phe Gln Trp Cys Ala Glu Ser Trp His Thr Asp Ser Tyr Phe Thr 
180 185 190 

Leu Phe Ile Val Met Met Leu Tyr Ala Pro Ala Ala Leu Ile Val Cys 
195 200 205 

Phe Thr Tyr Phe Asn Ile Phe Arg Ile Cys Gln Gln His Thr Lys Asp 
210 215 220 

Ile Ser Glu Arg Gln Ala Arg Phe Ser Ser Gln Ser Gly Glu Thr Gly 
225 230 235 240 

Glu Val Gln Ala Cys Pro Asp Lys Arg Tyr Lys Met Val Leu Phe Arg 
245 250 255 

Ile Thr Ser Val Phe Tyr Ile Leu Trp Leu Pro Tyr Ile Ile Tyr Phe 
260 265 270 

Leu Leu Glu Ser Ser Thr Gly His Ser Asn Arg Phe Ala Ser Phe Leu 
275 280 285 

Thr Thr Trp Leu Ala Ile Ser Asn Ser Phe Cys Asn Cys Val Ile Tyr 
290 295 300 

Ser Leu Ser Asn Ser Val Phe Gln Arg Gly Leu Lys Arg Leu Ser Gly 
305 310 315 320 

Ala Met Cys Thr Ser Cys Ala Ser Gln Thr Thr Ala Asn Asp Pro Tyr 
325 330 335 

Thr Val Arg Ser Lys Gly Pro Leu Asn Gly Cys His Ile 
340 345 

(190) INFORMATION FOR SEQ ID NO:189: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1302 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

ATGTGTTTTT CTCCCATTCT GGAAATCAAC ATGCAGTCTG AATCTAACAT TACAGTCCGA 60 

GATGACATTG ATGACATCAA CACCAATATG TACCAACCAC TATCATATCC GTTAAGCTTT 120

CAAGTGTCTC TCACCGGATT TCTTATGTTA GAAATTGTGT TGGGACTTGG CAGCAACCTC 180

ACTGTATTGG TACTTTACTG CATGAAATCC AACTTAATCA ACTCTGTCAG TAACATTATT 240 

ACAATGAATC TTCATGTACT TGATGTAATA ATTTGTGTGG GATGTATTCC TCTAACTATA 300 

GTTATCCTTC TGCTTTCACT GGAGAGTAAC ACTGCTCTCA TTTGCTGTTT CCATGAGGCT 360

TGTGTATCTT TTGCAAGTGT CTCAACAGCA ATCAACGTTT TTGCTATCAC TTTGGACAGA 420 

TATGACATCT CTGTAAAACC TGCAAACCGA ATTCTGACAA TGGGCAGAGC TGTAATGTTA 480 

ATGATATCCA TTTGGATTTT TTCTTTTTTC TCTTTCCTGA TTCCTTTTAT TGAGGTAAAT 540

TTTTTCAGTC TTCAAAGTGG AAATACCTGG GAAAACAAGA CACTTTTATG TGTCAGTACA 600
AATGAATACT ACACTGAACT GGGAATGTAT TATCACCTGT TAGTACAGAT CCCAATATTC 660

TTTTTCACTG TTGTAGTAAT GTTAATCACA TACACCAAAA TACTTCAGGC TCTTAATATT 720 

CGAATAGGCA CAAGATTTTC AACAGGGCAG AAGAAGAAAG CAAGAAAGAA AAAGACAATT 780

TCTCTAACCA CACAACATGA GGCTACAGAC ATGTCACAAA GCAGTGGTGG GAGAAATGTA 840 

GTCTTTGGTG TAAGAACTTC AGTTTCTGTA ATAATTGCCC TCCGGCGAGC TGTGAAACGA 900 

CACCGTGAAC GACGAGAAAG ACAAAAGAGA GTCAAGAGGA TGTCTTTATT GATTATTTCT 960 

ACATTTCTTC TCTGCTGGAC ACCAATTTCT GTTTTAAATA CCACCATTTT ATGTTTAGGC 1020 

CCAAGTGACC TTTTAGTAAA ATTAAGATTG TGTTTTTTAG TCATGGCTTA TGGAACAACT 1080 

ATATTTCACC CTCTATTATA TGCATTCACT AGACAAAAAT TTCAAAAGGT CTTGAAAAGT 1140 

AAAATGAAAA AGCGAGTTGT TTCTATAGTA GAAGCTGATC CCCTGCCTAA TAATGCTGTA 1200 

ATACACAACT CTTGGATAGA TCCCAAAAGA AACAAAAAAA TTACCTTTGA AGATAGTGAA 1260 

ATAAGAGAAA AACGTTTAGT GCCTCAGGTT GTCACAGACT AG 1302 
(191) INFORMATION FOR SEQ ID NO:190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

Met Cys Phe Ser Pro Ile Leu Glu Ile Asn Met Gln Ser Glu Ser Asn
1 5 10 15

Ile Thr Val Arg Asp Asp Ile Asp Asp Ile Asn Thr Asn Met Tyr Gln
20 25 30

Pro Leu Ser Tyr Pro Leu Ser Phe Gln Val Ser Leu Thr Gly Phe Leu
35 40 45

Met Leu Glu Ile Val Leu Gly Leu Gly Ser Asn Leu Thr Val Leu Val
50 55 60

Leu Tyr Cys Met Lys Ser Asn Leu Ile Asn Ser Val Ser Asn Ile Ile
65 70 75 80

Thr Met Asn Leu His Val Leu Asp Val Ile Ile Cys Val Gly Cys Ile
85 90 95

Pro Leu Thr Ile Val Ile Leu Leu Leu Ser Leu Glu Ser Asn Thr Ala
100 105 110

Leu Ile Cys Cys Phe His Glu Ala Cys Val Ser Phe Ala Ser Val Ser
115 120 125

Thr Ala Ile Asn Val Phe Ala Ile Thr Leu Asp Arg Tyr Asp Ile Ser
130 135 140

Val Lys Pro Ala Asn Arg Ile Leu Thr Met Gly Arg Ala Val Met Leu
145 150 155 160

Met Ile Ser Ile Trp Ile Phe Ser Phe Phe Ser Phe Leu Ile Pro Phe
165 170 175

Ile Glu Val Asn Phe Phe Ser Leu Gln Ser Gly Asn Thr Trp Glu Asn
180 185 190

Lys Thr Leu Leu Cys Val Ser Thr Asn Glu Tyr Tyr Thr Glu Leu Gly
195 200 205

Met Tyr Tyr His Leu Leu Val Gln Ile Pro Ile Phe Phe Phe Thr Val
210 215 220

Val Val Met Leu Ile Thr Tyr Thr Lys Ile Leu Gln Ala Leu Asn Ile
225 230 235 240

Arg Ile Gly Thr Arg Phe Ser Thr Gly Gln Lys Lys Lys Ala Arg Lys
245 250 255

Lys Lys Thr Ile Ser Leu Thr Thr Gln His Glu Ala Thr Asp Met Ser
260 265 270

Gln Ser Ser Gly Gly Arg Asn Val Val Phe Gly Val Arg Thr Ser Val
275 280 285

Ser Val Ile Ile Ala Leu Arg Arg Ala Val Lys Arg His Arg Glu Arg
290 295 300

Arg Glu Arg Gln Lys Arg Val Lys Arg Met Ser Leu Leu Ile Ile Ser
305 310 315 320

Thr Phe Leu Leu Cys Trp Thr Pro Ile Ser Val Leu Asn Thr Thr Ile
325 330 335

Leu Cys Leu Gly Pro Ser Asp Leu Leu Val Lys Leu Arg Leu Cys Phe
340 345 350

Leu Val Met Ala Tyr Gly Thr Thr Ile Phe His Pro Leu Leu Tyr Ala
355 360 365

Phe Thr Arg Gln Lys Phe Gln Lys Val Leu Lys Ser Lys Met Lys Lys
370 375 380

Arg Val Val Ser Ile Val Glu Ala Asp Pro Leu Pro Asn Asn Ala Val
385 390 395 400

Ile His Asn Ser Trp Ile Asp Pro Lys Arg Asn Lys Lys Ile Thr Phe
405 410 415

Glu Asp Ser Glu Ile Arg Glu Lys Arg Leu Val Pro Gln Val Val Thr
420 425 430

Asp

(192) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1209 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

ATGTTGTGTC CTTCCAAGAC AGATGGCTCA GGGCACTCTG GTAGGATTCA CCAGGAAACT 60 

CATGGAGAAG GGAAAAGGGA CAAGATTAGC AACAGTGAAG GGAGGGAGAA TGGTGGGAGA 120

GGATTCCAGA TGAACGGTGG GTCGCTGGAG GCTGAGCATG CCAGCAGGAT GTCAGTTCTC 180 

AGAGCAAAGC CCATGTCAAA CAGCCAACGC TTGCTCCTTC TGTCCCCAGG ATCACCTCCT 240 

CGCACGGGGA GCATCTCCTA CATCAACATC ATCATGCCTT CGGTGTTCGG CACCATCTGC 300 

CTCCTGGGCA TCATCGGGAA CTCCACGGTC ATCTTCGCGG TCGTGAAGAA GTCCAAGCTG 360 

CACTGGTGCA ACAACGTCCC CGACATCTTC ATCATCAACC TCTCGGTAGT AGATCTCCTC 420 

TTTCTCCTGG GCATGCCCTT CATGATCCAC CAGCTCATGG GCAATGGGGT GTGGCACTTT 480 

GGGGAGACCA TGTGCACCCT CATCACGGCC ATGGATGCCA ATAGTCAGTT CACCAGCACC 540 

TACATCCTGA CCGCCATGGC CATTGACCGC TACCTGGCCA CTGTCCACCC CATCTCTTCC 600 

ACGAAGTTCC GGAAGCCCTC TGTGGCCACC CTGGTGATCT GCCTCCTGTG GGCCCTCTCC 660 

TTCATCAGCA TCACCCCTGT GTGGCTGTAT GCCAGACTCA TCCCCTTCCC AGGAGGTGCA 720 

GTGGGCTGCG GCATACGCCT GCCCAACCCA GACACTGACC TCTACTGGTT CACCCTGTAC 780 

CAGTTTTTCC TGGCCTTTGC CCTGCCTTTT GTGGTCATCA CAGCCGCATA CGTGAGGATC 840 

CTGCAGCGCA TGACGTCCTC AGTGGCCCCC GCCTCCCAGC GCAGCATCCG GCTGCGGACA 900 

AAGAGGGTGA AACGCACAGC CATCGCCATC TGTCTGGTCT TCTTTGTGTG CTGGGCACCC 960 

TACTATGTGC TACAGCTGAC CCAGTTGTCC ATCAGCCGCC CGACCCTCAC CTTTGTCTAC 1020 

TTATACAATG CGGCCATCAG CTTGGGCTAT GCCAACAGCT GCCTCAACCC CTTTGTGTAC 1080
ATCGTCCTCT OTOAOACO^ CCGCAAACCC TO^CTCCTOT CGGTOAAGCC TGCAGCCCAG 1140
GGGCAGCrrc GCGCTGXOVG CAACGCTCAG ACGGCI^CG AGGAGAGGAC AGAAAOO^ 1200
GGCACCTCA 1209

(193) INFORMATION FOR SEQ ID NO: 192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Met Leu Cys Pro Ser Lys Thr Asp Gly Ser Gly His Ser Gly Arg Ile
1 5 10 15

His Gln Glu Thr His Gly Glu Gly Lys Arg Asp Lys Ile Ser Asn Ser
20 25 30

Glu Gly Arg Glu Asn Gly Gly Arg Gly Phe Gln Met Asn Gly Gly Ser
35 40 45

Leu Glu Ala Glu His Ala Ser Arg Met Ser Val Leu Arg Ala Lys Pro
50 55 60

Met Ser Asn Ser Gln Arg Leu Leu Leu Leu Ser Pro Gly Ser Pro Pro
65 70 75 80

Arg Thr Gly Ser Ile Ser Tyr Ile Asn Ile Ile Met Pro Ser Val Phe
85 90 95

Gly Thr Ile Cys Leu Leu Gly Ile Ile Gly Asn Ser Thr Val Ile Phe
100 105 110

Ala Val Val Lys Lys Ser Lys Leu His Trp Cys Asn Asn Val Pro Asp
115 120 125

Ile Phe Ile Ile Asn Leu Ser Val Val Asp Leu Leu Phe Leu Leu Gly
130 135 140

Met Pro Phe Met Ile His Gln Leu Met Gly Asn Gly Val Trp His Phe
145 150 155 160

Gly Glu Thr Met Cys Thr Leu Ile Thr Ala Met Asp Ala Asn Ser Gln
165 170 175

Phe Thr Ser Thr Tyr Ile Leu Thr Ala Met Ala Ile Asp Arg Tyr Leu
180 185 190

Ala Thr Val His Pro Ile Ser Ser Thr Lys Phe Arg Lys Pro Ser Val
195 200 205

Ala Thr Leu Val Ile Cys Leu Leu Trp Ala Leu Ser Phe Ile Ser Ile
210 215 220

Thr Pro Val Trp Leu Tyr Ala Arg Leu Ile Pro Phe Pro Gly Gly Ala
225 230 235 240

Val Gly Cys Gly Ile Arg Leu Pro Asn Pro Asp Thr Asp Leu Tyr Trp
245 250 255

Phe Thr Leu Tyr Gln Phe Phe Leu Ala Phe Ala Leu Pro Phe Val Val
260 265 270

Ile Thr Ala Ala Tyr Val Arg Ile Leu Gln Arg Met Thr Ser Ser Val
275 280 285

Ala Pro Ala Ser Gln Arg Ser Ile Arg Leu Arg Thr Lys Arg Val Lys
290 295 300

Arg Thr Ala Ile Ala Ile Cys Leu Val Phe Phe Val Cys Trp Ala Pro
305 310 315 320

Tyr Tyr Val Leu Gln Leu Thr Gln Leu Ser Ile Ser Arg Pro Thr Leu
325 330 335

Thr Phe Val Tyr Leu Tyr Asn Ala Ala Ile Ser Leu Gly Tyr Ala Asn
340 345 350

Ser Cys Leu Asn Pro Phe Val Tyr Ile Val Leu Cys Glu Thr Phe Arg
355 360 365

Lys Arg Leu Val Leu Ser Val Lys Pro Ala Ala Gln Gly Gln Leu Arg
370 375 380

Ala Val Ser Asn Ala Gln Thr Ala Asp Glu Glu Arg Thr Glu Ser Lys
385 390 395 400

Gly Thr

(194) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 
ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAC 60 



GCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120
GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180
CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240
CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300 
CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTCTT CAACCTGCAC 360 
GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420 
ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTCGCC 480 
AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540 
ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600
ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660 
CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720 
CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GAAACGCATG 780
ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840 
GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900 
GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCGCCGCCT TCTCCAACAG CTGCCTAAAC 960 
CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020 
CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080 
CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTGA 1128 
(195) INFORMATION FOR SEQ ID NO:194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala His Ala Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Lys Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Ala Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(196) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 960 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:195:

ATGCCATTCC CAAACTGCTC AGCCCCCAGC ACTGTGGTGG CCACAGCTGT GGGTGTCTTG 60 

CTGGGGCTGG AGTGTGGGCT GGGTCTGCTG GGCAACGCGG TGGCGCTGTG GACCTTCCTG 120 

TTCCGGGTCA GGGTGTGGAA GCCGTACGCT GTCTACCTGC TCAACCTGGC CCTGGCTGAC 180 

CTGCTGTTGG CTGCGTGCCT GCCTTTCCTG GCCGCCTTCT ACCTGAGCCT CCAGGCTTGG 240 

CATCTGGGCC GTGTGGGCTG CTGGGCCCTG CGCTTCCTGC TGGACCTCAG CCGCAGCGTG 300 

GGGATGGCCT TCCTGGCCGC CGTGGCTTTG GACCGGTACC TCCGTGTGGT CCACCCTCGG 360 

CTTAAGGTCA ACCTGCTGTC TCCTCAGGCG GCCCTGGGGG TCTCGGGCCT CGTCTGGCTC 420 

CTGATGGTCG CCCTCACCTG CCCGGGCTTG CTCATCTCTG AGGCCGCCCA GAACTCCACC 480 

AGGTGCCACA GTTTCTACTC CAGGGCAGAC GGCTCCTTCA GCATCATCTG GCAGGAAGCA 540 

CTCTCCTGCC TTCAGTTTGT CCTCCCCTTT GGCCTCATCG TGTTCTGCAA TGCAGGCATC 600 

ATCAGGGCTC TCCAGAAAAG ACTCCGGGAG CCTGAGAAAC AGCCCAAGCT TCAGCGGGCC 660 

AAGGCACTGG TCACCTTGGT GGTGGTGCTG TTTGCTCTGT GCTTTCTGCC CTGCTTCCTG 720 

GCCAGAGTCC TGATGCACAT CTTCCAGAAT CTGGGGAGCT GCAGGGCCCT TTGTGCAGTG 780 

GCTCATACCT CGGATGTCAC GGGCAGCCTC ACCTACCTGC ACAGTGTCGT CAACCCCGTG 840 

GTATACTGCT TCTCCAGCCC CACCTTCAGG AGCTCCTATC GGAGGGTCTT CCACACCCTC 900 

CGAGGCAAAG GGCAGGCAGC AGAGCCCCCA GATTTCAACC CCAGAGACTC CTATTCCTGA 960 

(197) INFORMATION FOR SEQ ID NO:196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Met Pro Phe Pro Asn Cys Ser Ala Pro Ser Thr Val Val Ala Thr Ala
1 5 10 15

Val Gly Val Leu Leu Gly Leu Glu Cys Gly Leu Gly Leu Leu Gly Asn
20 25 30

Ala Val Ala Leu Trp Thr Phe Leu Phe Arg Val Arg Val Trp Lys Pro
35 40 45

Tyr Ala Val Tyr Leu Leu Asn Leu Ala Leu Ala Asp Leu Leu Leu Ala
50 55 60

Ala Cys Leu Pro Phe Leu Ala Ala Phe Tyr Leu Ser Leu Gln Ala Trp
65 70 75 80

His Leu Gly Arg Val Gly Cys Trp Ala Leu Arg Phe Leu Leu Asp Leu
85 90 95

Ser Arg Ser Val Gly Met Ala Phe Leu Ala Ala Val Ala Leu Asp Arg
100 105 110

Tyr Leu Arg Val Val His Pro Arg Leu Lys Val Asn Leu Leu Ser Pro
115 120 125

Gln Ala Ala Leu Gly Val Ser Gly Leu Val Trp Leu Leu Met Val Ala
130 135 140

Leu Thr Cys Pro Gly Leu Leu Ile Ser Glu Ala Ala Gln Asn Ser Thr
145 150 155 160

Arg Cys His Ser Phe Tyr Ser Arg Ala Asp Gly Ser Phe Ser Ile Ile
165 170 175

Trp Gln Glu Ala Leu Ser Cys Leu Gln Phe Val Leu Pro Phe Gly Leu
180 185 190

Ile Val Phe Cys Asn Ala Gly Ile Ile Arg Ala Leu Gln Lys Arg Leu
195 200 205

Arg Glu Pro Glu Lys Gln Pro Lys Leu Gln Arg Ala Lys Ala Leu Val
210 215 220

Thr Leu Val Val Val Leu Phe Ala Leu Cys Phe Leu Pro Cys Phe Leu
225 230 235 240

Ala Arg Val Leu Met His Ile Phe Gln Asn Leu Gly Ser Cys Arg Ala
245 250 255

Leu Cys Ala Val Ala His Thr Ser Asp Val Thr Gly Ser Leu Thr Tyr
260 265 270

Leu His Ser Val Val Asn Pro Val Val Tyr Cys Phe Ser Ser Pro Thr
275 280 285

Phe Arg Ser Ser Tyr Arg Arg Val Phe His Thr Leu Arg Gly Lys Gly
290 295 300

Gln Ala Ala Glu Pro Pro Asp Phe Asn Pro Arg Asp Ser Tyr Ser
305 310 315

(198) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

ATGGAGGAAG GTGGTGATTT TGACAACTAC TATGGGGCAG ACAACCAGTC TGAGTGTGAG 60

TACACAGACT GGAAATCCTC GGGGGCCCTC ATCCCTGCCA TCTACATGTT GGTCTTCCTC 120

CTGGGCACCA CGGGAAACGG TCTGGTGCTC TGGACCGTGT TTCGGAGCAG CCGGGAGAAG 180

AGGCGCTCAG CTGATATCTT CATTGCTAGC CTGGCGGTGG CTGACCTGAC CTTCGTGGTG 240

ACGCTGCCCC TGTGGGCTAC CTACACGTAC CGGGACTATG ACTGGCCCTT TGGGACCTTC 300

TTCTGCAAGC TCAGCAGCTA CCTCATCTTC GTCAACATGT ACGCCAGCGT CTTCTGCCTC 360

ACCGGCCTCA GCTTCGACCG CTACCTGGCC ATCGTGAGGC CAGTCGCCAA TGCTCGGCTG 420

AGGCTGCGGG TCAGCGGGGC CGTGGCCACG GCAGTTCTTT GGGTGCTGGC CGCCCTCCTG 480

GCCATGCCTG TCATGGTGTT ACGCACCACC GGGGACTTGG AGAACACCAC TAAGGTGCAG 540

TGCTACATGG ACTACTCCAT GGTGGCCACT GTGAGCTCAG AGTCGGCCTG GGAGGTGGGC 600

CTTGGGGTCT CGTCCACCAC CGTGGGCTTT GTGGTGCCCT TCACCATCAT GCTGACCTGT 660

TACTTCTTCA TCGCCCAAAC CATCGCTGGC CACTTCCGCA AGGAACGCAT CGAGGGCCTG 720

CGGAAGCGGC GCCGGCTTAA GAGCATCATC GTGGTCCTGG TGGTGACCTT TGCCCTGTGC 780

TGGATGCCCT ACCACCTGGT GAAGACGCTG TACATGCTGG GCAGCCTGCT GCACTGGCCC 840

TGTGACTTTG ACCTCTTCCT CATGAACATC TTCCCCTACT GCACCTGCAT CAGCTACGTC 900

AACAGCTGCC TCAACCCCTT CCTCTATGCC TTTTTCGACC CCCGCTTCCG CCAGGCCTGC 960

ACCTCCATGC TCTGCTGTGG CCAGAGCAGG TGCGCAGGCA CCTCCCACAG CAGCAGTGGG 1020

GAGAAGTCAG CCAGCTACTC TTCGGGGCAC AGCCAGGGGC CCGGCCCCAA CATCGGCAAG 1080

GGTGGAGAAC AGATGCACGA GAAATCCATC CCCTACAGCC AGGAGACCCT TGTGGTTGAC 1140

TAG 1143

(199) INFORMATION FOR SEQ ID NO:198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Met Glu Glu Gly Gly Asp Phe Asp Asn Tyr Tyr Gly Ala Asp Asn Gln
1 5 10 15

Ser Glu Cys Glu Tyr Thr Asp Trp Lys Ser Ser Gly Ala Leu Ile Pro
20 25 30

Ala Ile Tyr Met Leu Val Phe Leu Leu Gly Thr Thr Gly Asn Gly Leu
35 40 45

Val Leu Trp Thr Val Phe Arg Ser Ser Arg Glu Lys Arg Arg Ser Ala
50 55 60

Asp Ile Phe Ile Ala Ser Leu Ala Val Ala Asp Leu Thr Phe Val Val
65 70 75 80

Thr Leu Pro Leu Trp Ala Thr Tyr Thr Tyr Arg Asp Tyr Asp Trp Pro
85 90 95

Phe Gly Thr Phe Phe Cys Lys Leu Ser Ser Tyr Leu Ile Phe Val Asn
100 105 110

Met Tyr Ala Ser Val Phe Cys Leu Thr Gly Leu Ser Phe Asp Arg Tyr
115 120 125

Leu Ala Ile Val Arg Pro Val Ala Asn Ala Arg Leu Arg Leu Arg Val
130 135 140

Ser Gly Ala Val Ala Thr Ala Val Leu Trp Val Leu Ala Ala Leu Leu
145 150 155 160

Ala Met Pro Val Met Val Leu Arg Thr Thr Gly Asp Leu Glu Asn Thr
165 170 175

Thr Lys Val Gln Cys Tyr Met Asp Tyr Ser Met Val Ala Thr Val Ser
180 185 190

Ser Glu Trp Ala Trp Glu Val Gly Leu Gly Val Ser Ser Thr Thr Val
195 200 205

Gly Phe Val Val Pro Phe Thr Ile Met Leu Thr Cys Tyr Phe Phe Ile
210 215 220

Ala Gln Thr Ile Ala Gly His Phe Arg Lys Glu Arg Ile Glu Gly Leu
225 230 235 240

Arg Lys Arg Arg Arg Leu Lys Ser Ile Ile Val Val Leu Val Val Thr
245 250 255

Phe Ala Leu Cys Trp Met Pro Tyr His Leu Val Lys Thr Leu Tyr Met
260 265 270

Leu Gly Ser Leu Leu His Trp Pro Cys Asp Phe Asp Leu Phe Leu Met
275 280 285

Asn Ile Phe Pro Tyr Cys Thr Cys Ile Ser Tyr Val Asn Ser Cys Leu
290 295 300

Asn Pro Phe Leu Tyr Ala Phe Phe Asp Pro Arg Phe Arg Gln Ala Cys
305 310 315 320

Thr Ser Met Leu Cys Cys Gly Gln Ser Arg Cys Ala Gly Thr Ser His
325 330 335

Ser Ser Ser Gly Glu Lys Ser Ala Ser Tyr Ser Ser Gly His Ser Gln
340 345 350

Gly Pro Gly Pro Asn Met Gly Lys Gly Gly Glu Gln Met His Glu Lys
355 360 365

Ser Ile Pro Tyr Ser Gln Glu Thr Leu Val Val Asp
370 375 380

(200) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1119 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 
ATGAACTACC CGCTAACGCT GGAAATGGAC CTCGAGAACC TGGAGGACCT GTTCTGGGAA 60
CTGGACAGAT TGGACAACTA TAACGACACC TCCCTGGTCG AAAATCATCT CTGCCCTGCC 120
ACAGAGGGTC CCCTCATGGC CTCCTTCAAG GCCGTGTTCG TGCCCGTGGC CTACAGCCTC 180

ATCTTCCTCC TGGGCGTGAT CGGCAACGTC CTGGTGCTGG TGATCCTGGA GCGGCACCGG 240 

CAGACACGCA GTTCCACGGA GACCTTCCTG TTCCACCTGG CCGTGGCCGA CCTCCTGCTG 300

GTCTTCATCT TGCCCTTTGC CGTGGCCGAG GGCTCTGTGG GCTGGGTCCT GGGGACCTTC 360 

CTCTGC;^^ CTGTGATTGC CCTGCACAAA GTCAACTTCT ACTGCAGCAG CCTGCTCCTG 420 

GCCTGCATCG CCGTGGACCG CTACCTGGCC ATTGTCCACG CCGTCCATGC CTACCGCCAC 480 

CGCCGCCTCC TCTCCATCCA CATCACCTGT GGGACCATCT GGCTGGTGGG CTTCCTCCTT 540 

GCCTTGCCAG AGATTCTCTT CGCCAAAGTC AGCCAAGGCC ATCACAACAA CTCCCTGCCA 600 

CGTTGCACCT TCTCCCAAGA GAACCAAGCA GAAACGCATG CCTGGTTCAC CTCCCGATTC 660 

CTCTACCATG TGGCGGGATT CCTGCTGCCC ATGCTGGTGA TGGGCTGGTG CTACGTGGGG 720 

GTAGTGCACA GGTTGCGCCA GGCCCAGCGG CGCCCTCAGC GGCAGAAGGC AAAAAGGGTG 780 

GCCATCCTGG TGACAAGCAT CTTCTTCCTC TGCTGGTCAC CCTACCACAT CGTCATCTTC 840 

CTGGACACCC TGGCGAGGCT GAAGGCCGTG GACAATACCT GCAAGCTGAA TGGCTCTCTC 900 

CCCGTGGCCA TCACCATGTG TGAGTTCCTG GGCCTGGCCC ACTGCTGCCT CAACCCCATG 960 

CTCTACACTT TCGCCGGCGT GAAGTTCCGC AGTGACCTGT CGCGGCTCCT GACCAAGCTG 1020 

GGCTGTACCG GCCCTGCCTC CCTGTGCCAG CTCTTCCCTA GCTGGCGCAG GAGCAGTCTC 1080 

TCTGAGTCAG AGAATGCCAC CTCTCTCACC ACGTTCTAG 1119 
(201) INFORMATION FOR SEQ ID NO:200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

Met Asn Tyr Pro Leu Thr Leu Glu Met Asp Leu Glu Asn Leu Glu Asp
1 5 10 15

Leu Phe Trp Glu Leu Asp Arg Leu Asp Asn Tyr Asn Asp Thr Ser Leu
20 25 30

Val Glu Asn His Leu Cys Pro Ala Thr Glu Gly Pro Leu Met Ala Ser
35 40 45

Phe Lys Ala Val Phe Val Pro Val Ala Tyr Ser Leu Ile Phe Leu Leu
50 55 60

Gly Val Ile Gly Asn Val Leu Val Leu Val Ile Leu Glu Arg His Arg
65 70 75 80

Gln Thr Arg Ser Ser Thr Glu Thr Phe Leu Phe His Leu Ala Val Ala
85 90 95

Asp Leu Leu Leu Val Phe Ile Leu Pro Phe Ala Val Ala Glu Gly Ser
100 105 110

Val Gly Trp Val Leu Gly Thr Phe Leu Cys Lys Thr Val Ile Ala Leu
115 120 125

His Lys Val Asn Phe Tyr Cys Ser Ser Leu Leu Leu Ala Cys Ile Ala
130 135 140

Val Asp Arg Tyr Leu Ala Ile Val His Ala Val His Ala Tyr Arg His
145 150 155 160

Arg Arg Leu Leu Ser Ile His Ile Thr Cys Gly Thr Ile Trp Leu Val
165 170 175

Gly Phe Leu Leu Ala Leu Pro Glu Ile Leu Phe Ala Lys Val Ser Gln
180 185 190

Gly His His Asn Asn Ser Leu Pro Arg Cys Thr Phe Ser Gln Glu Asn
195 200 205

Gln Ala Glu Thr His Ala Trp Phe Thr Ser Arg Phe Leu Tyr His Val
210 215 220

Ala Gly Phe Leu Leu Pro Met Leu Val Met Gly Trp Cys Tyr Val Gly
225 230 235 240

Val Val His Arg Leu Arg Gln Ala Gln Arg Arg Pro Gln Arg Gln Lys
245 250 255

Ala Lys Arg Val Ala Ile Leu Val Thr Ser Ile Phe Phe Leu Cys Trp
260 265 270

Ser Pro Tyr His Ile Val Ile Phe Leu Asp Thr Leu Ala Arg Leu Lys
275 280 285

Ala Val Asp Asn Thr Cys Lys Leu Asn Gly Ser Leu Pro Val Ala Ile
290 295 300

Thr Met Cys Glu Phe Leu Gly Leu Ala His Cys Cys Leu Asn Pro Met
305 310 315 320

Leu Tyr Thr Phe Ala Gly Val Lys Phe Arg Ser Asp Leu Ser Arg Leu
325 330 335

Leu Thr Lys Leu Gly Cys Thr Gly Pro Ala Ser Leu Cys Gln Leu Phe
340 345 350

Pro Ser Trp Arg Arg Ser Ser Leu Ser Glu Ser Glu Asn Ala Thr Ser
355 360 365

Leu Thr Thr Phe
370

(202) INFORMATION FOR SEQ ID NO:201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

ATGGATGTGA CTTCCCAAGC CCGGGGCGTG GGCCTGGAGA TGTACCCAGG CACCGCGCAG 60 

CCTGCGGCCC CCAACACCAC CTCCCCCGAG CTCAACCTGT CCCACCCGCT CCTGGGCACC 120 

GCCCTGGCCA ATGGGACAGG TGAGCTCTCG GAGCACCAGC AGTACGTGAT CGGCCTGTTC 180 

CTCTCGTGCC TCTACACCAT CTTCCTCTTC CCCATCGGCT TTGTGGGCAA CATCCTGATC 240 

CTGGTGGTGA ACATCAGCTT CCGCGAGAAG ATGACCATCC CCGACCTGTA CTTCATCAAC 300 

CTGGCGGTGG CGGACCTCAT CCTGGTGGCC GACTCCCTCA TTGAGGTGTT CAACCTGCAC 360

GAGCGGTACT ACGACATCGC CGTCCTGTGC ACCTTCATGT CGCTCTTCCT GCAGGTCAAC 420

ATGTACAGCA GCGTCTTCTT CCTCACCTGG ATGAGCTTCG ACCGCTACAT CGCCCTGGCC 480 

AGGGCCATGC GCTGCAGCCT GTTCCGCACC AAGCACCACG CCCGGCTGAG CTGTGGCCTC 540 

ATCTGGATGG CATCCGTGTC AGCCACGCTG GTGCCCTTCA CCGCCGTGCA CCTGCAGCAC 600 

ACCGACGAGG CCTGCTTCTG TTTCGCGGAT GTCCGGGAGG TGCAGTGGCT CGAGGTCACG 660

CTGGGCTTCA TCGTGCCCTT CGCCATCATC GGCCTGTGCT ACTCCCTCAT TGTCCGGGTG 720 

CTGGTCAGGG CGCACCGGCA CCGTGGGCTG CGGCCCCGGC GGCAGAAGGC GAAGCGCATG 780

ATCCTCGCGG TGGTGCTGGT CTTCTTCGTC TGCTGGCTGC CGGAGAACGT CTTCATCAGC 840 

GTGCACCTCC TGCAGCGGAC GCAGCCTGGG GCCGCTCCCT GCAAGCAGTC TTTCCGCCAT 900

GCCCACCCCC TCACGGGCCA CATTGTCAAC CTCACCGCCT TCTCCAACAG CTGCCTAAAC 960 

CCCCTCATCT ACAGCTTTCT CGGGGAGACC TTCAGGGACA AGCTGAGGCT GTACATTGAG 1020 

CAGAAAACAA ATTTGCCGGC CCTGAACCGC TTCTGTCACG CTGCCCTGAA GGCCGTCATT 1080 

CCAGACAGCA CCGAGCAGTC GGATGTGAGG TTCAGCAGTG CCGTGTAG 1128
(203) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 
Met Asp Val Thr Ser Gln Ala Arg Gly Val Gly Leu Glu Met Tyr Pro
1 5 10 15

Gly Thr Ala Gln Pro Ala Ala Pro Asn Thr Thr Ser Pro Glu Leu Asn
20 25 30

Leu Ser His Pro Leu Leu Gly Thr Ala Leu Ala Asn Gly Thr Gly Glu
35 40 45

Leu Ser Glu His Gln Gln Tyr Val Ile Gly Leu Phe Leu Ser Cys Leu
50 55 60

Tyr Thr Ile Phe Leu Phe Pro Ile Gly Phe Val Gly Asn Ile Leu Ile
65 70 75 80

Leu Val Val Asn Ile Ser Phe Arg Glu Lys Met Thr Ile Pro Asp Leu
85 90 95

Tyr Phe Ile Asn Leu Ala Val Ala Asp Leu Ile Leu Val Ala Asp Ser
100 105 110

Leu Ile Glu Val Phe Asn Leu His Glu Arg Tyr Tyr Asp Ile Ala Val
115 120 125

Leu Cys Thr Phe Met Ser Leu Phe Leu Gln Val Asn Met Tyr Ser Ser
130 135 140

Val Phe Phe Leu Thr Trp Met Ser Phe Asp Arg Tyr Ile Ala Leu Ala
145 150 155 160

Arg Ala Met Arg Cys Ser Leu Phe Arg Thr Lys His His Ala Arg Leu
165 170 175

Ser Cys Gly Leu Ile Trp Met Ala Ser Val Ser Ala Thr Leu Val Pro
180 185 190

Phe Thr Ala Val His Leu Gln His Thr Asp Glu Ala Cys Phe Cys Phe
195 200 205

Ala Asp Val Arg Glu Val Gln Trp Leu Glu Val Thr Leu Gly Phe Ile
210 215 220

Val Pro Phe Ala Ile Ile Gly Leu Cys Tyr Ser Leu Ile Val Arg Val
225 230 235 240

Leu Val Arg Ala His Arg His Arg Gly Leu Arg Pro Arg Arg Gln Lys
245 250 255

Ala Lys Arg Met Ile Leu Ala Val Val Leu Val Phe Phe Val Cys Trp
260 265 270

Leu Pro Glu Asn Val Phe Ile Ser Val His Leu Leu Gln Arg Thr Gln
275 280 285

Pro Gly Ala Ala Pro Cys Lys Gln Ser Phe Arg His Ala His Pro Leu
290 295 300

Thr Gly His Ile Val Asn Leu Thr Ala Phe Ser Asn Ser Cys Leu Asn
305 310 315 320

Pro Leu Ile Tyr Ser Phe Leu Gly Glu Thr Phe Arg Asp Lys Leu Arg
325 330 335

Leu Tyr Ile Glu Gln Lys Thr Asn Leu Pro Ala Leu Asn Arg Phe Cys
340 345 350

His Ala Ala Leu Lys Ala Val Ile Pro Asp Ser Thr Glu Gln Ser Asp
355 360 365

Val Arg Phe Ser Ser Ala Val
370 375

(204) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1137 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203: 

ATGGACCTGG GGAAACCAAT GAAAAGCGTG CTGGTGGTGG CTCTCCTTGT CATTTTCCAG 60 

GTATGCCTGT GTCAAGATGA GGTCACGGAC GATTACATCG GAGACAACAC CACAGTGGAC 120 

TACACTTTGT TCGAGTCTTT GTGCTCCAAG AAGGACGTGC GGAACTTTAA AGCCTGGTTC 180 

CTCCCTATCA TGTACTCCAT CATTTGTTTC GTGGGCCTAC TGGGCAATGG GCTGGTCGTG 240 

TTGACCTATA TCTATTTCAA GAGGCTCAAG ACCATGACCG ATACCTACCT GCTCAACCTG 300 

GCGGTGGCAG ACATCCTCTT CCTCCTGACC CTTCCCTTCT GGGCCTACAG CGCGGCCAAG 360 

TCCTGGGTCT TCGGTGTCCA CTTTTGCAAG CTCATCTTTG CCATCTACAA GATGAGCTTC 420
TTCAGTGGCA TGCTCCTACT TCTTTGCATC AGCATTGACC GCTACGTGGC CATCGTCCAG 480

GCTGTCTCAG CTCACCGCCA CCGTGCCCGC GTCCTTCTCA TCAGCAAGCT GTCCTGTGTG 540

GGCATCTGGA TACTAGCCAC AGTGCTCTCC ATCCCAGAGC TCCTGTACAG TGACCTCCAG 600

AGGAGCAGCA GTGAGCAAGC GATGCGATGC TCTCTCATCA CAGAGCATGT GGAGGCCTTT 660

ATCACCATCC AGGTGGCCCA GATGGTGATC GGCTTTCTGG TCCCCCTGCT GGCCATGAGC 720

TTCTGTTACC TTGTCATCAT CCGCACCCTG CTCCAGGCAC GCAACTTTGA GCGCAACAAG 780

GCCAAAAAGG TGATCATCGC TGTGGTCGTG GTCTTCATAG TCTTCCAGCT GCCCTACAAT 840

GGGGTGGTCC TGGCCCAGAC GGTGGCCAAC TTCAACATCA CCAGTAGCAC CTGTGAGCTC 900

AGTAAGCAAC TCAACATCGC CTACGACGTC ACCTACAGCC TGGCCTGCGT CCGCTGCTGC 960 

GTCAACCCTT TCTTGTACGC CTTCATCGGC GTCAAGTTCC GCAACGATCT CTTCAAGCTC 1020 

TTCAAGGACC TGGGCTGCCT CAGCCAGGAG CAGCTCCGGC AGTGGTCTTC CTGTCGGCAC 1080 

ATCCGGCGCT CCTCCATGAG TGTGGAGGCC GAGACCACCA CCACCTTCTC CCCATAG 1137
(205) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS :

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu
1 5 10 15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr
20 25 30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys
35 40 45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met
50 55 60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65 70 75 80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr
85 90 95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro
100 105 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe
115 120 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met
130 135 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145 150 155 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys
165 170 175

Leu Ser Cys Val Gly Ile Trp Ile Leu Ala Thr Val Leu Ser Ile Pro
180 185 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
195 200 205

Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln
210 215 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225 230 235 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe
245 250 255

Glu Arg Asn Lys Ala Lys Lys Val Ile Ile Ala Val Val Val Val Phe
260 265 270

Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val
275 280 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu
290 295 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305 310 315 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp
325 330 335

Leu Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu
340 345 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val
355 360 365

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
370 375



(206) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1086 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 



ATGGATATAC AAATGGCAAA CAATTTTACT CCGCCCTCTG CAACTCCTCA GGGAAATGAC 60 

TGTGACCTCT ATGCACATCA CAGCACGGCC AGGATAGTAA TGCCTCTGCA TTACAGCCTC 120 

GTCTTCATCA TTGGGCTCGT GGGAAACTTA CTAGCCTTGG TCGTCATTGT TCAAAACAGG 180 

AAAAAAATCA ACTCTACCAC CCTCTATTCA ACAAATTTGG TGATTTCTGA TATACTTTTT 240 

ACCACGGCTT TGCCTACACG AATAGCCTAC TATGCAATGG GCTTTGACTG GAGAATCGGA 300 

GATGCCTTGT GTAGGATAAC TGCGCTAGTG TTTTACATCA ACACATATGC AGGTGTGAAC 360 

TTTATGACCT GCCTGAGTAT TGACCGCTTC ATTGCTGTGG TGCACCCTCT ACGCTACAAC 420 

AAGATAAAAA GGATTGAACA TGCAAAAGGC GTGTGCATAT TTGTCTGGAT TCTAGTATTT 480 

GCTCAGACAC TCCCACTCCT CATCAACCCT ATGTCAAAGC AGGAGGCTGA AAGGATTACA 540 

TGCATGGAGT ATCCAAACTT TGAAGAAACT AAATCTCTTC CCTGGATTCT GCTTGGGGCA 600 

TGTTTCATAG GATATGTACT TCCACTTATA ATCATTCTCA TCTGCTATTC TCAGATCTGC 660 

TGCAAACTCT TCAGAACTGC CAAACAAAAC CCACTCACTG AGAAATCTGG TGTAAACAAA 720 

AAGGCTAAAA ACACAATTAT TCTTATTATT GTTGTGTTTG TTCTCTGTTT CACACCTTAC 780 

CATGTTGCAA TTATTCAACA TATGATTAAG AAGCTTCGTT TCTCTAATTT CCTGGAATGT 840 

AGCCAAAGAC ATTCGTTCCA GATTTCTCTG CACTTTACAG TATGCCTGAT GAACTTCAAT 900 

TGCTGCATGG ACCCTTTTAT CTACTTCTTT GCATGTAAAG GGTATAAGAG AAAGGTTATG 960 

AGGATGCTGA AACGGCAAGT CAGTGTATCG ATTTCTAGTG CTGTGAAGTC AGCCCCTGAA 1020 

GAAAATTCAC GTGA7ATGAC AGAAACGCAG ATGATGATAC ATTCCAAGTC TTCAAATGGA 1080 

AAGTGA 1086 



(207) INFORMATION FOR SEQ ID NO: 206: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 

Met Asp Ile Gln Met Ala Asn Asn Phe Thr Pro Pro Ser Ala Thr Pro 
1 5 10 15 

Gln Gly Asn Asp Cys Asp Leu Tyr Ala His His Ser Thr Ala Arg Ile 
20 25 30 

Val Met Pro Leu His Tyr Ser Leu Val Phe Ile Ile Gly Leu Val Gly 
35 40 45 

Asn Leu Leu Ala Leu Val Val Ile Val Gln Asn Arg Lys Lys Ile Asn 
50 55 60 

Ser Thr Thr Leu Tyr Ser Thr Asn Leu Val Ile Ser Asp Ile Leu Phe 
65 70 75 80 

Thr Thr Ala Leu Pro Thr Arg Ile Ala Tyr Tyr Ala Met Gly Phe Asp 
85 90 95 

Trp Arg Ile Gly Asp Ala Leu Cys Arg Ile Thr Ala Leu Val Phe Tyr 
100 105 110 

Ile Asn Thr Tyr Ala Gly Val Asn Phe Met Thr Cys Leu Ser Ile Asp 
115 120 125 

Arg Phe Ile Ala Val Val His Pro Leu Arg Tyr Asn Lys Ile Lys Arg 
130 135 140 

Ile Glu His Ala Lys Gly Val Cys Ile Phe Val Trp Ile Leu Val Phe 
145 150 155 160 

Ala Gln Thr Leu Pro Leu Leu Ile Asn Pro Met Ser Lys Gln Glu Ala 
165 170 175 

Glu Arg Ile Thr Cys Met Glu Tyr Pro Asn Phe Glu Glu Thr Lys Ser 
180 185 190 

Leu Pro Trp Ile Leu Leu Gly Ala Cys Phe Ile Gly Tyr Val Leu Pro 
195 200 205 

Leu Ile Ile Ile Leu Ile Cys Tyr Ser Gln Ile Cys Cys Lys Leu Phe 
210 215 220 

Arg Thr Ala Lys Gln Asn Pro Leu Thr Glu Lys Ser Gly Val Asn Lys 
225 230 235 240 

Lys Ala Lys Asn Thr Ile Ile Leu Ile Ile Val Val Phe Val Leu Cys 
245 250 255 

Phe Thr Pro Tyr His Val Ala Ile Ile Gln His Met Ile Lys Lys Leu 
260 265 270 

Arg Phe Ser Asn Phe Leu Glu Cys Ser Gln Arg His Ser Phe Gln Ile 
275 280 285 

Ser Leu His Phe Thr Val Cys Leu Met Asn Phe Asn Cys Cys Met Asp 
290 295 300 

Pro Phe Ile Tyr Phe Phe Ala Cys Lys Gly Tyr Lys Arg Lys Val Met 
305 310 315 320 

Arg Met Leu Lys Arg Gln Val Ser Val Ser Ile Ser Ser Ala Val Lys 
325 330 335 

Ser Ala Pro Glu Glu Asn Ser Arg Glu Met Thr Glu Thr Gln Met Met 
340 345 350 

Ile His Ser Lys Ser Ser Asn Gly Lys 
355 360 

(208) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 

ATGCGGTGGC TGTGGCCCCT GGCTGTCTCT CTTGCTGTGA TTTTGGCTGT GGGGCTAAGC 60 

AGGGTCTCTG GGGGTGCCCC CCTGCACCTG GGCAGGCACA GAGCCGAGAC CCAGGAGCAG 120 

CAGAGCCGAT CCAAGAGGGG CACCGAGGAT GAGGAGGCCA AGGGCGTGCA GCAGTATGTG 180 

CCTGAGGAGT GGGCGGAGTA CCCCCGGCCC ATTCACCCTG CTGGCCTGCA GCCAACCAAG 240 

CCCTTGGTGG CCACCAGCCC TAACCCCGAC AAGGATGGGG GCACCCCAGA CAGTGGGCAG 300 

GAACTGAGGG GCAATCTGAC AGGGGCACCA GGGCAGAGGC TACAGATCCA GAACCCCCTG 360 

TATCCGGTGA CCGAGAGCTC CTACAGTGCC TATGCCATCA TGCTTCTGGC GCTGGTGGTG 420 

TTTGCGGTGG GCATTGTGGG CAACCTGTCG GTCATGTGCA TCGTGTGGCA CAGCTACTAC 480 

CTGAAGAGCG CCTGGAACTC CATCCTTGCC AGCCTGGCCC TCTGGGATTT TCTGGTCCTC 540 

TTTTTCTGCC TCCCTATTGT CATCTTCAAC GAGATCACCA AGCAGAGGCT ACTGGGTGAC 600 

GTTTCTTGTC GTGCCGTGCC CTTCATGGAG GTCTCCTCTC TGGGAGTCAC GACTTTCAGC 660 

CTCTGTGCCC TGGGCATTGA CCGCTTCCAC GTGGCCACCA GCACCCTGCC CAAGGTGAGG 720 

CCCATCGAGC GGTGCCAATC CATCCTGGCC AAGTTGGCTG TCATCTCGGT GGGCTCCATG 780 

ACGCTGGCTG TGCCTGAGCT CCTGCTGTGG CAGCTGGCAC AGGAGCCTGC CCCCACCATG 840 

GGCACCCTGG ACTCATGCAT CATGAAACCC TCAGCCAGCC TGCCCGAGTC CCTGTATTCA 900 

CTGGTGATGA CCTACCAGAA CGCCCGCATG TGGTGGTACT TTGGCTGCTA CTTCTGCCTG 960 

CCCATCCTCT TCACAGTCAC CTGCCAGCTG GTGACATGGC GGGTGCGAGG CCCTCCAGGG 1020 

AGGAAGTCAG AGTGCAGGGC CAGCAAGCAC GAGCAGTGTG AGAGCCAGCT CAAGAGCACC 1080 

GTGGTGGGCC TGACCGTGGT CTACGCCTTC TGCACCCTCC CAGAGAACGT CTGCAACATC 1140 

GTGGTGGCCT ACCTCTCCAC CGAGCTGACC CGCCAGACCC TGGACCTCCT GGGCCTCATC 1200 

AACCAGTTCT CCACCTTCTT CAAGGGCGCC ATCACCCCAG TGCTGCTCCT TTGCATCTGC 1260 

AGGCCGCTGG GCCAGGCCTT CCTGGACTGC TGCTGCTGCT GCTGCTGTGA GGAGTGCGGC 1320 

GGGGCTTCGG AGGCCTCTGC TGCCAATGGG TCGGACAACA AGCTCAAGAC CGAGGTGTCC 1380 

TCTTCCATCT ACTTCCACAA GCCCAGGGAG TCACCCCCAC TCCTGCCCCT GGGCACACCT 1440 

TGCTGA 1446 
(209) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 481 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208: 

Met Arg Trp Leu Trp Pro Leu Ala Val Ser Leu Ala Val Ile Leu Ala 
1 5 10 15 

Val Gly Leu Ser Arg Val Ser Gly Gly Ala Pro Leu His Leu Gly Arg 
20 25 30 

His Arg Ala Glu Thr Gln Glu Gln Gln Ser Arg Ser Lys Arg Gly Thr 
35 40 45 

Glu Asp Glu Glu Ala Lys Gly Val Gln Gln Tyr Val Pro Glu Glu Trp 
50 55 60 

Ala Glu Tyr Pro Arg Pro Ile His Pro Ala Gly Leu Gln Pro Thr Lys 
65 70 75 80 

Pro Leu Val Ala Thr Ser Pro Asn Pro Asp Lys Asp Gly Gly Thr Pro 
85 90 95 

Asp Ser Gly Gln Glu Leu Arg Gly Asn Leu Thr Gly Ala Pro Gly Gln 
100 105 110 

Arg Leu Gln Ile Gln Asn Pro Leu Tyr Pro Val Thr Glu Ser Ser Tyr 
115 120 125 

Ser Ala Tyr Ala Ile Met Leu Leu Ala Leu Val Val Phe Ala Val Gly 
130 135 140 

Ile Val Gly Asn Leu Ser Val Met Cys Ile Val Trp His Ser Tyr Tyr 
145 150 155 160 

Leu Lys Ser Ala Trp Asn Ser Ile Leu Ala Ser Leu Ala Leu Trp Asp 
165 170 175 

Phe Leu Val Leu Phe Phe Cys Leu Pro Ile Val Ile Phe Asn Glu Ile 
180 185 190 

Thr Lys Gln Arg Leu Leu Gly Asp Val Ser Cys Arg Ala Val Pro Phe 
195 200 205 

Met Glu Val Ser Ser Leu Gly Val Thr Thr Phe Ser Leu Cys Ala Leu 
210 215 220 

Gly Ile Asp Arg Phe His Val Ala Thr Ser Thr Leu Pro Lys Val Arg 
225 230 235 240 

Pro Ile Glu Arg Cys Gln Ser Ile Leu Ala Lys Leu Ala Val Ile Trp 
245 250 255 

Val Gly Ser Met Thr Leu Ala Val Pro Glu Leu Leu Leu Trp Gln Leu 
260 265 270 

Ala Gln Glu Pro Ala Pro Thr Met Gly Thr Leu Asp Ser Cys Ile Met 
275 280 285 

Lys Pro Ser Ala Ser Leu Pro Glu Ser Leu Tyr Ser Leu Val Met Thr 
290 295 300 

Tyr Gln Asn Ala Arg Met Trp Trp Tyr Phe Gly Cys Tyr Phe Cys Leu 
305 310 315 320 

Pro Ile Leu Phe Thr Val Thr Cys Gln Leu Val Thr Trp Arg Val Arg 
325 330 335 

Gly Pro Pro Gly Arg Lys Ser Glu Cys Arg Ala Ser Lys His Glu Gln 
340 345 350 

Cys Glu Ser Gln Leu Lys Ser Thr Val Val Gly Leu Thr Val Val Tyr 
355 360 365 

Ala Phe Cys Thr Leu Pro Glu Asn Val Cys Asn Ile Val Val Ala Tyr 
370 375 380 

Leu Ser Thr Glu Leu Thr Arg Gln Thr Leu Asp Leu Leu Gly Leu Ile 
385 390 395 400 

Asn Gln Phe Ser Thr Phe Phe Lys Gly Ala Ile Thr Pro Val Leu Leu 
405 410 415 

Leu Cys Ile Cys Arg Pro Leu Gly Gln Ala Phe Leu Asp Cys Cys Cys 
420 425 430 

Cys Cys Cys Cys Glu Glu Cys Gly Gly Ala Ser Glu Ala Ser Ala Ala 
435 440 445 

Asn Gly Ser Asp Asn Lys Leu Lys Thr Glu Val Ser Ser Ser Ile Tyr 
450 455 460 

Phe His Lys Pro Arg Glu Ser Pro Pro Leu Leu Pro Leu Gly Thr Pro 
465 470 475 480 

Cys 



(210) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1101 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

ATGTGGAACG CGACGCCCAG CGAAGAGCCG GGGTTCAACC TCACACTGGC CGACCTGGAC 60 

TGGGATGCTT CCCCCGGCAA CGACTCGCTG GGCGACGAGC TGCTGCAGCT CTTCCCCGCG 120 

CCGCTGCTGG CGGGCGTCAC AGCCACCTGC GTGGCACTCT TCGTGGTGGG TATCGCTGGC 180 

AACCTGCTCA CCATGCTGGT GGTGTCGCGC TTCCGCGAGC TGCGCACCAC CACCAACCTC 240 

TACCTGTCCA GCATGGCCTT CTCCGATCTG CTCATCTTCC TCTGCATGCC CCTGGACCTC 300 

GTTCGCCTCT GGCAGTACCG GCCCTGGAAC TTCGGCGACC TCCTCTGCAA ACTCTTCCAA 360 

TTCGTCAGTG AGAGCTGCAC CTACGCCACG GTGCTCACCA TCACAGCGCT GAGCGTCGAG 420 

CGCTACTTCG CCATCTGCTT CCCACTCCGG GCCAAGGTGG TGGTCACCAA GGGGCGGGTG 480 

AAGCTGGTCA TCTTCGTCAT CTGGGCCGTG GCCTTCTGCA GCGCCGGGCC CATCTTCGTG 540 

CTAGTCGGGG TGGAGCACGA GAACGGCACC GACCCTTGGG ACACCAACGA GTGCCGCCCC 600 

ACCGAGTTTG CGGTGCGCTC TGGACTGCTC ACGGTCATGG TGTGGGTGTC CAGCATCTTC 660 

TTCTTCCTTC CTGTCTTCTG TCTCACGGTC CTCTACAGTC TCATCGGCAG GAAGCTGTGG 720 

CGGAGGAGGC GCGGCGATGC TGTCGTGGGT GCCTCGCTCA GGGACCAGAA CCACAAGCJUV 780 

ACCAAGAAAA TGCTGGCTGT AGTGGTGTTT GCCTTCATCC TCTGCTGGCT CCCCTTCCAC 840 
GTAGGGCGAT ATTTATTTTC CAAATCCTTT GAGCCTCGCT CCTTGGAGAT TGCTCAGATC 900 
AGCCAGTACT GCAACCTCGT GTCCTTTGTC CTCTTCTACC TCAGTGCTGC CATCAACCCC 960 
ATTCTGTACA ACATCATGTC CAAGAAGTAC CGGGTGGCAG TGTTCAGACT TCTGGGATTC 1020 
GAACCCTTCT CCCAGAGAAA GCTCTCCACT CTGAAAGATG AAAGTTCTCG GGCCTGGACA 1080 

GAATCTAGTA TTAATACATG A 1101 

(211) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 366 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 

Met Trp Asn Ala Thr Pro Ser Glu Glu Pro Gly Phe Asn Leu Thr Leu 
1 5 10 15 

Ala Asp Leu Asp Trp Asp Ala Ser Pro Gly Asn Asp Ser Leu Gly Asp 
20 25 30 

Glu Leu Leu Gln Leu Phe Pro Ala Pro Leu Leu Ala Gly Val Thr Ala 
35 40 45 

Thr Cys Val Ala Leu Phe Val Val Gly Ile Ala Gly Asn Leu Leu Thr 
50 55 60 

Met Leu Val Val Ser Arg Phe Arg Glu Leu Arg Thr Thr Thr Asn Leu 
65 70 75 80 

Tyr Leu Ser Ser Met Ala Phe Ser Asp Leu Leu Ile Phe Leu Cys Met 
85 90 95 

Pro Leu Asp Leu Val Arg Leu Trp Gln Tyr Arg Pro Trp Asn Phe Gly 
100 105 110 

Asp Leu Leu Cys Lys Leu Phe Gln Phe Val Ser Glu Ser Cys Thr Tyr 
115 120 125 

Ala Thr Val Leu Thr Ile Thr Ala Leu Ser Val Glu Arg Tyr Phe Ala 
130 135 140 

Ile Cys Phe Pro Leu Arg Ala Lys Val Val Val Thr Lys Gly Arg Val 
145 150 155 160 

Lys Leu Val Ile Phe Val Ile Trp Ala Val Ala Phe Cys Ser Ala Gly 
165 170 175 

Pro Ile Phe Val Leu Val Gly Val Glu His Glu Asn Gly Thr Asp Pro 
180 185 190 

Trp Asp Thr Asn Glu Cys Arg Pro Thr Glu Phe Ala Val Arg Ser Gly 
195 200 205 

Leu Leu Thr Val Met Val Trp Val Ser Ser Ile Phe Phe Phe Leu Pro 
210 215 220 

Val Phe Cys Leu Thr Val Leu Tyr Ser Leu Ile Gly Arg Lys Leu Trp 
225 230 235 240 

Arg Arg Arg Arg Gly Asp Ala Val Val Gly Ala Ser Leu Arg Asp Gln 
245 250 255 

Asn His Lys Gln Thr Lys Lys Met Leu Ala Val Val Val Phe Ala Phe 
260 265 270 

Ile Leu Cys Trp Leu Pro Phe His Val Gly Arg Tyr Leu Phe Ser Lys 
275 280 285 

Ser Phe Glu Pro Gly Ser Leu Glu Ile Ala Gln Ile Ser Gln Tyr Cys 
290 295 300 

Asn Leu Val Ser Phe Val Leu Phe Tyr Leu Ser Ala Ala Ile Asn Pro 
305 310 315 320 

Ile Leu Tyr Asn Ile Met Ser Lys Lys Tyr Arg Val Ala Val Phe Arg 
325 330 335 

Leu Leu Gly Phe Glu Pro Phe Ser Gln Arg Lys Leu Ser Thr Leu Lys 
340 345 350 

Asp Glu Ser Ser Arg Ala Trp Thr Glu Ser Ser Ile Asn Thr 
355 360 365 

(212) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

ATGCGAGCCC CGGGCGCGCT TCTCGCCCGC ATGTCGCGGC TACTGCTTCT GCTACTGCTC 60 

AAGGTGTCTG CCTCTTCTGC CCTCGGGGTC GCCCCTGCGT CCAGAAACGA AACTTGTCTG 120 

GGGGAGAGCT GTGCACCTAC AGTGATCCAG CGCCGCGGCA GGGACGCCTG GGGACCGGGA 180 

AATTCTGCAA GAGACGTTCT GCGAGCCCGA GCACCCAGGG AGGAGCAGGG GGCAGCGTTT 240 

CTTGCGGGAC CCTCCTGGGA CCTGCCGGCG GCCCCGGGCC GTGACCCGGC TGCAGGCAGA 300 

GGGGCGGAGG CGTCGGCAGC CGGACCCCCG GGACCTCCAA CCAGGCCACC TGGCCCCTGG 360 

AGGTGGAAAG GTGCTCGGGG TCAGGAGCCT TCTGAAACTT TGGGGAGAGG GAACCCCACG 420 

GCCCTCCAGC TCTTCCTTCA GATCTCAGAG GAGGAAGAGA AGGGTCCCAG AGGCGCTGGC 480 

ATTTCCGGGC GTAGCCAGGA GCAGAGTGTG AAGACAGTCC CCGGAGCCAG CGATCTTTTT 540 

TACTGGCCAA GGAGAGCCGG GAAACTCCAG GGTTCCCACC ACAAGCCCCT GTCCAAGACG 600 

GCCAATGGAC TGGCGGGGCA CGAAGGGTGG ACAATTGCAC TCCCGGGCCG GGCGCTGGCC 660 

CAGAATGGAT CCTTGGGTGA AGGAATCCAT GAGCCTGGGG GTCCCCGCCG GGGAAACAGC 720 

ACGAACCGGC GTGTGAGACT GAAGAACCCC TTCTACCCGC TGACCCAGGA GTCCTATGGA 780 

GCCTACGCGG TCATGTGTCT GTCCGTGGTG ATCTTCGGGA CCGGCATCAT TGGCAACCTG 840 

GCGGTGATGT GCATCGTGTG CCACAACTAC TACATGCGGA GCATCTCCAA CTCCCTCTTG 900 

GCCAACCTGG CCTTCTGGGA CTTTCTCATC ATCTTCTTCT GCCTTCCGCT GGTCATCTTC 960 

CACGAGCTGA CCAAGAAGTG GCTGCTGGAG GACTTCTCCT GCAAGATCGT GCCCTATATA 1020 

GAGGTCGCCT CTCTGGGAGT CACCACTTTC ACCTTATGTG CTCTGTGCAT AGACCGCTTC 1080 

CGTGCTGCCA CCAACGTACA GATGTACTAC GAAATGATCG AAAATTGTTC CTCAACAACT 1140 

GCCAAACTTG CTGTTATATG GGTGGGAGCT CTATTGTTAG CACTTCCAGA AGTTGTTCTC 1200 

CGCCAGCTGA GCAAGGAGGA TTTGGGGTTT AGTGGCCGAG CTCCGGCAGA AAGGTGCATT 1260 

ATTAAGATCT CTCCTGATTT ACCAGACACC ATCTATGTTC TAGCCCTCAC CTACGACAGT 1320 

GCGAGACTGT GGTGGTATTT TGGCTGTTAC TTTTGTTTGC CCACGCTTTT CACCATCACC 1380 

TGCTCTCTAG TGACTGCGAG GAAAATCCGC AAAGCAGAGA AAGCCTGTAC CCGAGGGAAT 1440 

AAACGGCAGA TTCAACTAGA GAGTCAGATG AAGTGTACAG TAGTGGCACT GACCATTTTA 1500 

TATGGATTTT GCATTATTCC TGAAAATATC TGCAACATTG TTACTGCCTA CATGGCTACA 1560 

GGGGTTTCAC AGCAGACAAT GGACCTCCTT AATATCATCA GCCAGTTCCT TTTGTTCTTT 1620 

AAGTCCTGTG TCACCCCAGT CCTCCTTTTC TGTCTCTGCA AACCCTTCAG TCGGGCCTTC 1680 

ATGGAGTGCT GCTGCTGTTG CTGTGAGGAA TGCATTCAGA AGTCTTCAAC GGTGACCAGT 1740 

GATGACAATG ACAACGAGTA CACCACGGAA CTCGAACTCT CGCCTTTCAG TACCATACGC 1800 

CGTGAAATGT CCACTTTTGC TTCTGTCGGA ACTCATTGCT GA 1842 

(213) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:212: 

Met Arg Ala Pro Gly Ala Leu Leu Ala Arg Met Ser Arg Leu Leu Leu 
1 5 10 15 

Leu Leu Leu Leu Lys Val Ser Ala Ser Ser Ala Leu Gly Val Ala Pro 
20 25 30 

Ala Ser Arg Asn Glu Thr Cys Leu Gly Glu Ser Cys Ala Pro Thr Val 
35 40 45 

Ile Gln Arg Arg Gly Arg Asp Ala Trp Gly Pro Gly Asn Ser Ala Arg 
50 55 60 

Asp Val Leu Arg Ala Arg Ala Pro Arg Glu Glu Gln Gly Ala Ala Phe 
65 70 75 80 

Leu Ala Gly Pro Ser Trp Asp Leu Pro Ala Ala Pro Gly Arg Asp Pro 
85 90 95 

Ala Ala Gly Arg Gly Ala Glu Ala Ser Ala Ala Gly Pro Pro Gly Pro 
100 105 110 

Pro Thr Arg Pro Pro Gly Pro Trp Arg Trp Lys Gly Ala Arg Gly Gln 
115 120 125 

Glu Pro Ser Glu Thr Leu Gly Arg Gly Asn Pro Thr Ala Leu Gln Leu 
130 135 140 

Phe Leu Gln Ile Ser Glu Glu Glu Glu Lys Gly Pro Arg Gly Ala Gly 
145 150 155 160 

Ile Ser Gly Arg Ser Gln Glu Gln Ser Val Lys Thr Val Pro Gly Ala 
165 170 175 

Ser Asp Leu Phe Tyr Trp Pro Arg Arg Ala Gly Lys Leu Gln Gly Ser 
180 185 190 

His His Lys Pro Leu Ser Lys Thr Ala Asn Gly Leu Ala Gly His Glu 
195 200 205 

Gly Trp Thr Ile Ala Leu Pro Gly Arg Ala Leu Ala Gln Asn Gly Ser 
210 215 220 

Leu Gly Glu Gly Ile His Glu Pro Gly Gly Pro Arg Arg Gly Asn Ser 
225 230 235 240 

Thr Asn Arg Arg Val Arg Leu Lys Asn Pro Phe Tyr Pro Leu Thr Gln 
245 250 255 

Glu Ser Tyr Gly Ala Tyr Ala Val Met Cys Leu Ser Val Val Ile Phe 
260 265 270 

Gly Thr Gly Ile Ile Gly Asn Leu Ala Val Met Cys Ile Val Cys His 
275 280 285 

Asn Tyr Tyr Met Arg Ser Ile Ser Asn Ser Leu Leu Ala Asn Leu Ala 
290 295 300 

Phe Trp Asp Phe Leu Ile Ile Phe Phe Cys Leu Pro Leu Val Ile Phe 
305 310 315 320 

His Glu Leu Thr Lys Lys Trp Leu Leu Glu Asp Phe Ser Cys Lys Ile 
325 330 335 

Val Pro Tyr Ile Glu Val Ala Ser Leu Gly Val Thr Thr Phe Thr Leu 
340 345 350 

Cys Ala Leu Cys Ile Asp Arg Phe Arg Ala Ala Thr Asn Val Gln Met 
355 360 365 

Tyr Tyr Glu Met Ile Glu Asn Cys Ser Ser Thr Thr Ala Lys Leu Ala 
370 375 380 

Val Ile Trp Val Gly Ala Leu Leu Leu Ala Leu Pro Glu Val Val Leu 
385 390 395 400 

Arg Gln Leu Ser Lys Glu Asp Leu Gly Phe Ser Gly Arg Ala Pro Ala 
405 410 415 

Glu Arg Cys Ile Ile Lys Ile Ser Pro Asp Leu Pro Asp Thr Ile Tyr 
420 425 430 

Val Leu Ala Leu Thr Tyr Asp Ser Ala Arg Leu Trp Trp Tyr Phe Gly 
435 440 445 

Cys Tyr Phe Cys Leu Pro Thr Leu Phe Thr Ile Thr Cys Ser Leu Val 
450 455 460 

Thr Ala Arg Lys Ile Arg Lys Ala Glu Lys Ala Cys Thr Arg Gly Asn 
465 470 475 480 

Lys Arg Gln Ile Gln Leu Glu Ser Gln Met Lys Cys Thr Val Val Ala 
485 490 495 

Leu Thr Ile Leu Tyr Gly Phe Cys Ile Ile Pro Glu Asn Ile Cys Asn 
500 505 510 

Ile Val Thr Ala Tyr Met Ala Thr Gly Val Ser Gln Gln Thr Met Asp 
515 520 525 

Leu Leu Asn Ile Ile Ser Gln Phe Leu Leu Phe Phe Lys Ser Cys Val 
530 535 540 

Thr Pro Val Leu Leu Phe Cys Leu Cys Lys Pro Phe Ser Arg Ala Phe 
545 550 555 560 

Met Glu Cys Cys Cys Cys Cys Cys Glu Glu Cys Ile Gln Lys Ser Ser 
565 570 575 

Thr Val Thr Ser Asp Asp Asn Asp Asn Glu Tyr Thr Thr Glu Leu Glu 
580 585 590 

Leu Ser Pro Phe Ser Thr Ile Arg Arg Glu Met Ser Thr Phe Ala Ser 
595 600 605 

Val Gly Thr His Cys 
610 

(214) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1248 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 

ATGGTTTTTG CTCACAGAAT GGATAACAGC AAGCCACATT TGATTATTCC TACACTTCTG 60 

GTGCCCCTCC AAAACCGCAG CTGCACTGAA ACAGCCACAC CTCTGCCAAG CCAATACCTG 120 

ATGGAATTAA GTGAGGAGCA CAGTTGGATG AGCAACCAAA CAGACCTTCA CTATGTGCTG 180 

AAACCCGGGG AAGTGGCCAC AGCCAGCATC TTCTTTGGGA TTCTGTGGTT GTTTTCTATC 240 

TTCGGCAATT CCCTGGTTTG TTTGGTCATC CATAGGAGTA GGAGGACTCA GTCTACCACC 300 

AACTACTTTG TGGTCTCCAT GGCATGTGCT GACCTTCTCA TCAGCGTTGC CAGCACGCCT 360 

TTCGTCCTGC TCCAGTTCAC CACTGGAAGG TGGACGCTGG GTAGTGCAAC GTGCAAGGTT 420 

GTGCGATATT TTCAATATCT CACTCCAGGT GTCCAGATCT ACGTTCTCCT CTCCATCTGC 480 

ATAGACCGGT TCTACACCAT CGTCTATCCT CTGAGCTTCA AGGTGTCCAG AGAAAAAGCC 540 

AAGAAAATGA TTGCGGCATC GTGGATCTTT GATGCAGGCT TTGTGACCCC TGTGCTCTTT 600 

TTCTATGGCT CCAACTGGGA CAGTCATTGT AACTATTTCC TCCCCTCCTC TTGGGAAGGC 660 

ACTGCCTACA CTGTCATCCA CTTCTTGGTG GGCTTTGTGA TTCCATCTGT CCTCATAATT 720 

TTATTTTACC AAAAGGTCAT AAAATATATT TGGAGAATAG GCACAGATGG CCGAACGGTG 780 

AGGAGGACAA TGAACATTGT CCCTCGGAC3V AAAGTCAAAA CTAAAAAGAT GTTCCTCATT 840 
TTAAATCTGT TGTTTTTGCT CTCCTGGCTG CCTTTTCATG TAGCTCAGCT ATGGCACCCC 900 
CATGAACAAG ACTATAAGAA AAGTTCCCTT GTTTTCACAG CTATCACATG GATATCCTTT 960 
AGTTCTTCAG CCTCTAAACC TACTCTGTAT TCAATTTATA ATGCCAATTT TCGGAGAGGG 1020 
ATGAAAGAGA CTTTTTGCAT GTCCTCTATG AAATCTTACC GAAGCAATGC CTATACTATC 1080 
ACAACAAGTT CAAGGATGGC CAAAAAAAAC TACGTTGGCA TTTCAGAAAT CCCTTCCATG 1140 
GCCAAAACTA TTACCAAAGA CTCGATCTAT GACTCATTTG ACAGAGAAGC CAAGGAAAAA 1200 
AAGCTTGCTT GGCCCATTAA CTCAAATCCA CCAAATACTT TTGTCTAA 1248 
(215) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Met Val Phe Ala His Arg Met Asp Asn Ser Lys Pro His Leu Ile Ile 
1 5 10 15 

Pro Thr Leu Leu Val Pro Leu Gln Asn Arg Ser Cys Thr Glu Thr Ala 
20 25 30 

Thr Pro Leu Pro Ser Gln Tyr Leu Met Glu Leu Ser Glu Glu His Ser 
35 40 45 

Trp Met Ser Asn Gln Thr Asp Leu His Tyr Val Leu Lys Pro Gly Glu 
50 55 60 

Val Ala Thr Ala Ser Ile Phe Phe Gly Ile Leu Trp Leu Phe Ser Ile 
65 70 75 80 

Phe Gly Asn Ser Leu Val Cys Leu Val Ile His Arg Ser Arg Arg Thr 
85 90 95 

Gln Ser Thr Thr Asn Tyr Phe Val Val Ser Met Ala Cys Ala Asp Leu 
100 105 110 

Leu Ile Ser Val Ala Ser Thr Pro Phe Val Leu Leu Gln Phe Thr Thr 
115 120 125 

Gly Arg Trp Thr Leu Gly Ser Ala Thr Cys Lys Val Val Arg Tyr Phe 
130 135 140 

Gln Tyr Leu Thr Pro Gly Val Gln Ile Tyr Val Leu Leu Ser Ile Cys 
145 150 155 160 

Ile Asp Arg Phe Tyr Thr Ile Val Tyr Pro Leu Ser Phe Lys Val Ser 
165 170 175 

Arg Glu Lys Ala Lys Lys Met Ile Ala Ala Ser Trp Ile Phe Asp Ala 
180 185 190 

Gly Phe Val Thr Pro Val Leu Phe Phe Tyr Gly Ser Asn Trp Asp Ser 
195 200 205 

His Cys Asn Tyr Phe Leu Pro Ser Ser Trp Glu Gly Thr Ala Tyr Thr 
210 215 220 

Val Ile His Phe Leu Val Gly Phe Val Ile Pro Ser Val Leu Ile Ile 
225 230 235 240 

Leu Phe Tyr Gln Lys Val Ile Lys Tyr Ile Trp Arg Ile Gly Thr Asp 
245 250 255 

Gly Arg Thr Val Arg Arg Thr Met Asn Ile Val Pro Arg Thr Lys Val 
260 265 270 

Lys Thr Lys Lys Met Phe Leu Ile Leu Asn Leu Leu Phe Leu Leu Ser 
275 280 285 

Trp Leu Pro Phe His Val Ala Gln Leu Trp His Pro His Glu Gln Asp 
290 295 300 

Tyr Lys Lys Ser Ser Leu Val Phe Thr Ala Ile Thr Trp Ile Ser Phe 
305 310 315 320 

Ser Ser Ser Ala Ser Lys Pro Thr Leu Tyr Ser Ile Tyr Asn Ala Asn 
325 330 335 

Phe Arg Arg Gly Met Lys Glu Thr Phe Cys Met Ser Ser Met Lys Cys 
340 345 350 

Tyr Arg Ser Asn Ala Tyr Thr Ile Thr Thr Ser Ser Arg Met Ala Lys 
355 360 365 

Lys Asn Tyr Val Gly Ile Ser Glu Ile Pro Ser Met Ala Lys Thr Ile 
370 375 380 

Thr Lys Asp Ser Ile Tyr Asp Ser Phe Asp Arg Glu Ala Lys Glu Lys 
385 390 395 400 

Lys Leu Ala Trp Pro Ile Asn Ser Asn Pro Pro Asn Thr Phe Val 
405 410 415 

(216) INFORMATION FOR SEQ ID NO: 215: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 

ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 
CCAGAATACC CACCGGCTCT AATCATCTTT ATCTTCTGCG CGATGGTTAT CACCATCGTT 120 
GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180 
AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240 
CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300 
TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360 
GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420 
AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480 
CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540 
AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600 
CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660 
CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATAAACT AACCATGTTT 720 
GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780 
GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840 
TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900 
TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCTCT 960 
GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTCCC 1020 
CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080 
ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140 
CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200 
TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260 
GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320 
CCTGCCTCTG TCCATTTCAA GGCTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380 
AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440 
CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAATG CTGCCACCAG CCACCCTAAA 1500 

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTG CTGACTATCC CAAGCCTGCC 1560 

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GACAACCCTG AGCTCTCTGC CTCCCATTGC 1620 

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCC TGAGTCGGCC 1680 

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTGGAGTC TGACACCATC 1740 

GCTGACCTTC CTGACCCTAC TGTAGTCACT ACCAGTACCA ATGATTACCA TGATGTCGTG 1800 

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842 
(217) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 613 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys 
1 5 10 15 

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe 
20 25 30 

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met 
35 40 45 

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn 
50 55 60 

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr 
65 70 75 80 

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu 
85 90 95 

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val 
100 105 110 

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys 
115 120 125 

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn 
130 135 140 

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val 
145 150 155 160 

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr 
165 170 175 

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile 
180 185 190 

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr 
195 200 205 

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln 
210 215 220 

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Lys Leu Thr Met Phe 
225 230 235 240 

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu 
245 250 255 

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro 
260 265 270 

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys 
275 280 285 

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu 
290 295 300 

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Ser 
305 310 315 320 

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala 
325 330 335 

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala 
340 345 350 

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val 
355 360 365 

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly 
370 375 380 

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala 
385 390 395 400 

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly 
405 410 415 

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro 
420 425 430 

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Ala 
435 440 445 

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser 
450 455 460 

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His 
465 470 475 480 

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Asn Ala Ala Thr 
485 490 495 

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr 
500 505 510 

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala 
515 520 525 

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro 
530 535 540 

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala 
545 550 555 560 

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu 
565 570 575 

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser 
580 585 590 

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro 
595 600 605 

Asp Glu Met Ala Val 
610 

(218) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1854 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

ATGGGGCCCA CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120 

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180 

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240 

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300 

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360 

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420 
AGTCTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTCTC 480 
CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540 
AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600 
CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660 
CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATAAACT AACCATGTTT 720 
GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780 
GCTGTCAGTC CGAAGGAGAT GGCAGGCAAG ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840 
TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATCT ACGGGCTCCT CAATGAGAAT 900 
TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCTCT 960 
GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020 
CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTCC TGTGGAGGAA 1080 
ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140 
CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200 
TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260 
GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTCCCACTGT CTACCCTAAG 1320 
CCTGCCTCTG TCCATTTCAA GGCTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380 
AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGGAACC CCAAGCCCAT CACTGGCCAC 1440 
CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAGTG CTCCCACCAG CCACCCTAAA 1500 
CCCACCACTG GCCACATCAA GCCAGCTACC AGCCATGCTC AGCCCACCAC TGCTGACTAT 1560 
CCCAAGCCTG CCACTACCAG CCACCCTAAG CCCACTGCTG CTGACAACCC TGAGCTCTCT 1620 
GCCTCCCATT GCCCCGAGAT CCCTGCCATT GCCCACCCTG TGTCTGACGA CAGTGACCTC 1680 
CCTGAGTCGG CCTCTAGCCC TGCCGCTGGG CCCACCAAGC CTGCTGCCAG CCAGCTGGAG 1740 
TCTGACACCA TCGCTGACCT TCCTGACCCT ACTGTAGTCA CTACCAGTAC CAATGATTAC 1800 
CATGATGTCG TGGTTGTTGA TGTTGAAGAT GATCCTGATG AAATGGCTGT GTGA 1854 
(219) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 617 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218: 

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Lys Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Ser
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Ala
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Ser Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Thr Thr Gly His Ile Lys Pro Ala Thr Ser His
500 505 510

Ala Glu Pro Thr Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His
515 520 525

Pro Lys Pro Thr Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys
530 535 540

Pro Glu Ile Pro Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu
545 550 555 560

Pro Glu Ser Ala Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala
565 570 575

Ser Gln Leu Glu Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val
580 585 590

Val Thr Thr Ser Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val
595 600 605

Glu Asp Asp Pro Asp Glu Met Ala Val
610 615

(220) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1548 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

ATGGGACATA ACGGGAGCTG GATCTCTCCA AATGCCAGCG AGCCGCACAA CGCGTCCGGC 60 

GCCGAGGCTG CGGGTGTGAA CCGCAGCGCG CTCGGGGAGT TCGGCGAGGC GCAGCTGTAC 120 

CGCCAGTTCA CCACCACCGT GCAGGTCGTC ATCTTCATAG GCTCGCTGCT CGGAAACTTC 180 

ATGGTGTTAT GGTCAACTTG CCGCACAACC GTGTTCAAAT CTGTCACCAA CAGGTTCATT 240 

AAAAACCTGG CCTGCTCGGG GATTTGTGCC AGCCTGGTCT GTGTGCCCTT CGACATCATC 300 

CTCAGCACCA GTCCTCACTG TTGCTGGTGG ATCTACACCA TGCTCTTCTG CAAGGTCGTC 360 

AAATTTTTGC ACAAAGTATT CTGCTCTGTG ACCATCCTCA GCTTCCCTGC TATTGCTTTG 420 

GACAGGTACT ACTCAGTCCT CTATCCACTG GAGAGGAAAA TATCTGATGC CAAGTCCCGT 480 

GAACTGGTGA TGTACATCTG GGCCCATGCA GTGGTGGCCA GTGTCCCTGT GTTTGCAGTA 540 

ACCAATGTGG CTGACATCTA TGCCACGTCC ACCTGCACGG AAGTCTGGAG CAACTCCTTG 600 

GGCCACCTGG TGTACGTTCT GGTGTATAAC ATCACCACGG TCATTGTGCC TGTGGTGGTG 660 

GTGTTCCTCT TCTTGATACT GATCCGACGG GCCCTGAGTG CCAGCCAGAA GAAGAAGGTC 720 

ATCATAGCAG CGCTCCGGAC CCCACAGAAC ACCATCTCTA TTCCCTATGC CTCCCAGCGG 780 

GAGGCCGAGC TGAAAGCCAC CCTGCTCTCC ATGGTGATGG TCTTCATCTT GTGTAGCGTG 840

CCCTATGCCA CCCTGGTCGT CTACCAGACT GTGCTCAATG TCCCTGACAC TTCCGTCTTC 900

TTGCTGCTCA CTGCTGTTTG GCTGCCCAAA GTCTCCCTCC TGGCAAACCC TGTTCTCTTT 960
CTTACTGTGA ACAAATCTGT CCGCAAGTGC TTGATAGGGA CCCTGGTGCA ACTACACCAC 1020 
CGGTACAGTC GCCGTAATGT GGTCAGTACA GGGAGTGGCA TGGCTGAGGC CAGCCTGGAA 1080 
CCCAGCATAC GCTCGGGTAG CCAGCTCCTG GAGATGTTCC ACATTCGGCA GCAGCAGATC 1140 
TTTAAGCCCA CAGAGGATGA GGAAGAGAGT GAGGCCAAGT ACATTCGCTC AGCTGACTTC 1200
CAGGCCAAGG AGATATTTAG CACCTGCCTG GAGGGAGAGC AGGGGCCACA GTTTGCGCCC 1260
TCTGCCCCAC CCCTGAGCAC AGTGGACTCT GTATCCCAGG TGGCACCGGC AGCCCCTGTG 1320
GAACCTGAAA CATTCCCTGA TAAGTATTCC CTGCAGTTTG GCTTTGGGCC TTTTGAGTTG 1380
CCTCCTCAGT GGCTCTCAGA GACCCGAAAC AGCAAGAAGC GGCTGCTTCC CCCCTTGGGC 1440
AACACCCCAG AAGAGCTGAT CCAGACAAAG GTGCCCAAGG TAGGCAGGGT GGAGCGGAAG 1500
ATGAGCAGAA ACAATAAAGT GAGCATTTTT CCAAAGGTGG ATTCCTAG 1548
(221) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Met Gly His Asn Gly Ser Trp Ile Ser Pro Asn Ala Ser Glu Pro His
1 5 10 15

Asn Ala Ser Gly Ala Glu Ala Ala Gly Val Asn Arg Ser Ala Leu Gly
20 25 30

Glu Phe Gly Glu Ala Gln Leu Tyr Arg Gln Phe Thr Thr Thr Val Gln
35 40 45

Val Val Ile Phe Ile Gly Ser Leu Leu Gly Asn Phe Met Val Leu Trp
50 55 60

Ser Thr Cys Arg Thr Thr Val Phe Lys Ser Val Thr Asn Arg Phe Ile
65 70 75 80

Lys Asn Leu Ala Cys Ser Gly Ile Cys Ala Ser Leu Val Cys Val Pro
85 90 95

Phe Asp Ile Ile Leu Ser Thr Ser Pro His Cys Cys Trp Trp Ile Tyr
100 105 110






Thr Met Leu Phe Cys Lys Val Val Lys Phe Leu His Lys Val Phe Cys
115 120 125

Ser Val Thr Ile Leu Ser Phe Pro Ala Ile Ala Leu Asp Arg Tyr Tyr
130 135 140

Ser Val Leu Tyr Pro Leu Glu Arg Lys Ile Ser Asp Ala Lys Ser Arg
145 150 155 160

Glu Leu Val Met Tyr Ile Trp Ala His Ala Val Val Ala Ser Val Pro
165 170 175

Val Phe Ala Val Thr Asn Val Ala Asp Ile Tyr Ala Thr Ser Thr Cys
180 185 190

Thr Glu Val Trp Ser Asn Ser Leu Gly His Leu Val Tyr Val Leu Val
195 200 205

Tyr Asn Ile Thr Thr Val Ile Val Pro Val Val Val Val Phe Leu Phe
210 215 220

Leu Ile Leu Ile Arg Arg Ala Leu Ser Ala Ser Gln Lys Lys Lys Val
225 230 235 240

Ile Ile Ala Ala Leu Arg Thr Pro Gln Asn Thr Ile Ser Ile Pro Tyr
245 250 255

Ala Ser Gln Arg Glu Ala Glu Leu Lys Ala Thr Leu Leu Ser Met Val
260 265 270

Met Val Phe Ile Leu Cys Ser Val Pro Tyr Ala Thr Leu Val Val Tyr
275 280 285

Gln Thr Val Leu Asn Val Pro Asp Thr Ser Val Phe Leu Leu Leu Thr
290 295 300

Ala Val Trp Leu Pro Lys Val Ser Leu Leu Ala Asn Pro Val Leu Phe
305 310 315 320

Leu Thr Val Asn Lys Ser Val Arg Lys Cys Leu Ile Gly Thr Leu Val
325 330 335

Gln Leu His His Arg Tyr Ser Arg Arg Asn Val Val Ser Thr Gly Ser
340 345 350

Gly Met Ala Glu Ala Ser Leu Glu Pro Ser Ile Arg Ser Gly Ser Gln
355 360 365

Leu Leu Glu Met Phe His Ile Gly Gln Gln Gln Ile Phe Lys Pro Thr
370 375 380

Glu Asp Glu Glu Glu Ser Glu Ala Lys Tyr Ile Gly Ser Ala Asp Phe
385 390 395 400

Gln Ala Lys Glu Ile Phe Ser Thr Cys Leu Glu Gly Glu Gln Gly Pro
405 410 415

Gln Phe Ala Pro Ser Ala Pro Pro Leu Ser Thr Val Asp Ser Val Ser
420 425 430

Gln Val Ala Pro Ala Ala Pro Val Glu Pro Glu Thr Phe Pro Asp Lys
435 440 445

Tyr Ser Leu Gln Phe Gly Phe Gly Pro Phe Glu Leu Pro Pro Gln Trp
450 455 460

Leu Ser Glu Thr Arg Asn Ser Lys Lys Arg Leu Leu Pro Pro Leu Gly
465 470 475 480

Asn Thr Pro Glu Glu Leu Ile Gln Thr Lys Val Pro Lys Val Gly Arg
485 490 495

Val Glu Arg Lys Met Ser Arg Asn Asn Lys Val Ser Ile Phe Pro Lys
500 505 510

Val Asp Ser
515

(222) INFORMATION FOR SEQ ID NO : 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:221:

ATGAATCGGC ACCATCTGCA GGATCACTTT CTGGAAATAG ACAAGAAGAA CTGCTGTGTG 60 

TTCCGAGATG ACTTCATTGC CAAGGTGTTG CCGCCGGTGT TGGGGCTGGA GTTTATCTTT 120

GGGCTTCTGG GCAATGGCCT TGCCCTGTGG ATTTTCTGTT TCCACCTCAA GTCCTGGAAA 180 

TCCAGCCGGA TTTTCCTGTT CAACCTGGCA GTAGCTGACT TTCTACTGAT CATCTGCCTG 240 

CCGTTCGTGA TGGACTACTA TGTGCGGCGT TCAGACTGGA AGTTTGGGGA CATCCCTTGC 300 

CGGCTGGTGC TCTTCATGTT TGCCATGAAC CGCCAGGGCA GCATCATCTT CCTCACGGTG 360 

GTGGCGGTAG ACAGGTATTT CCGGGTGGTC CATCCCCACC ACGCCCTGAA CAAGATCTCC 420 

AATTGGACAG CAGCCATCAT CTCTTGCCTT CTGTGGGGCA TCACTGTTGG CCTAACAGTC 480 

CACCTCCTGA AGAAGAAGTT GCTGATCCAG AATGGCCCTG CAAATGTGTG CATCAGCTTC 540

AGCATCTGCC ATACCTTCCG GTGGCACGAA GCTATGTTCC TCCTGGAGTT CCTCCTGCCC 600

CTGGGCATCA TCCTGTTCTG CTCAGCCAGA ATTATCTGGA GCCTGCGGCA GAGACAAATG 660

GACCGGCATG CCAAGATCAA GAGAGCCAAA ACCTTCATCA TGGTGGTGGC CATCGTCTTT 720

GTCATCTGCT TCCTTCCCAG CGTGGTTGTG CGGATCCGCA TCTTCTGGCT CCTGCACACT 780 

TCGGGCACGC AGAATTGTGA AGTGTACCGC TCGGTGGACC TGGCGTTCTT TATCACTCTC 840 

AGCTTCACCT ACATGAACAG CATGCTGGAC CCCGTGGTGT ACTACTTCTC CAGCCCATCC 900 

TTTCCCAACT TCTTCTCCAC TTTGATCAAC CGCTGCCTCC AGAGGAAGAT GACAGGTGAG 960 

CCAGATAATA ACCGCAGCAC GAGCGTCGAG CTCACAGGGG ACCCCAACAA AACCAGAGGC 1020 

GCTCCAGAGG CGTTAATGGC CAACTCCGGT GAGCCATGGA GCCCCTCTTA TCTGGGCCCA 1080 

ACCTCAAATA ACCATTCCAA GAAGGGACAT TGTCACCAAG AACCAGCATC TCTGGAGAAA 1140 

CAGTTGGGCT GTTGCATCGA GTAA 1164
(223) INFORMATION FOR SEQ ID NO:222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Met Asn Arg His His Leu Gln Asp His Phe Leu Glu Ile Asp Lys Lys
1 5 10 15

Asn Cys Cys Val Phe Arg Asp Asp Phe Ile Ala Lys Val Leu Pro Pro
20 25 30

Val Leu Gly Leu Glu Phe Ile Phe Gly Leu Leu Gly Asn Gly Leu Ala
35 40 45

Leu Trp Ile Phe Cys Phe His Leu Lys Ser Trp Lys Ser Ser Arg Ile
50 55 60

Phe Leu Phe Asn Leu Ala Val Ala Asp Phe Leu Leu Ile Ile Cys Leu
65 70 75 80

Pro Phe Val Met Asp Tyr Tyr Val Arg Arg Ser Asp Trp Lys Phe Gly
85 90 95

Asp Ile Pro Cys Arg Leu Val Leu Phe Met Phe Ala Met Asn Arg Gln
100 105 110

Gly Ser Ile Ile Phe Leu Thr Val Val Ala Val Asp Arg Tyr Phe Arg
115 120 125

Val Val His Pro His His Ala Leu Asn Lys Ile Ser Asn Trp Thr Ala
130 135 140

Ala Ile Ile Ser Cys Leu Leu Trp Gly Ile Thr Val Gly Leu Thr Val
145 150 155 160

His Leu Leu Lys Lys Lys Leu Leu Ile Gln Asn Gly Pro Ala Asn Val
165 170 175

Cys Ile Ser Phe Ser Ile Cys His Thr Phe Arg Trp His Glu Ala Met
180 185 190

Phe Leu Leu Glu Phe Leu Leu Pro Leu Gly Ile Ile Leu Phe Cys Ser
195 200 205

Ala Arg Ile Ile Trp Ser Leu Arg Gln Arg Gln Met Asp Arg His Ala
210 215 220

Lys Ile Lys Arg Ala Lys Thr Phe Ile Met Val Val Ala Ile Val Phe
225 230 235 240

Val Ile Cys Phe Leu Pro Ser Val Val Val Arg Ile Arg Ile Phe Trp
245 250 255

Leu Leu His Thr Ser Gly Thr Gln Asn Cys Glu Val Tyr Arg Ser Val
260 265 270

Asp Leu Ala Phe Phe Ile Thr Leu Ser Phe Thr Tyr Met Asn Ser Met
275 280 285

Leu Asp Pro Val Val Tyr Tyr Phe Ser Ser Pro Ser Phe Pro Asn Phe
290 295 300

Phe Ser Thr Leu Ile Asn Arg Cys Leu Gln Arg Lys Met Thr Gly Glu
305 310 315 320

Pro Asp Asn Asn Arg Ser Thr Ser Val Glu Leu Thr Gly Asp Pro Asn
325 330 335

Lys Thr Arg Gly Ala Pro Glu Ala Leu Met Ala Asn Ser Gly Glu Pro
340 345 350

Trp Ser Pro Ser Tyr Leu Gly Pro Thr Ser Asn Asn His Ser Lys Lys
355 360 365

Gly His Cys His Gln Glu Pro Ala Ser Leu Glu Lys Gln Leu Gly Cys
370 375 380

Cys Ile Glu
385

(224) INFORMATION FOR SEQ ID NO: 223: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1212 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223:

ATGGCTTGCA ATGGCAGTGC GGCCAGGGGG CACTTTGACC CTGAGGACTT GAACCTGACT 60 

GACGAGGCAC TGAGACTCAA GTACCTGGGG CCCCAGCAGA CAGAGCTGTT CATGCCCATC 120 

TGTGCCACAT ACCTGCTGAT CTTCGTGGTG GGCGCTGTGG GCAATGGGCT GACCTGTCTG 180 

GTCATCCTGC GCCACAAGGC CATGCGCACG CCTACCAACT ACTACCTCTT CAGCCTGGCC 240 

GTGTCGGACC TGCTGGTGCT GCTGGTGGGC CTGCCCCTGG AGCTCTATGA GATGTGGCAC 300 

AACTACCCCT TCCTGCTGGG CGTTGGTGGC TGCTATTTCC GCACGCTACT GTTTGAGATG 360 

GTCTGCCTGG CCTCAGTGCT CAACGTCACT GCCCTGAGCG TGGAACGCTA TGTGGCCGTG 420 

GTGCACCCAC TCCAGGCCAG GTCCATGGTG ACGCGGGCCC ATGTGCGCCG AGTGCTTGGG 480 

GCCGTCTGGG GTCTTGCCAT GCTCTGCTCC CTGCCCAACA CCAGCCTGCA CGGCATCCGG 540 

CAGCTGCACG TGCCCTGCCG GGGCCCAGTG CCAGACTCAG CTGTTTGCAT GCTGGTCCGC 600 

CCACGGGCCC TCTACAACAT GGTAGTGCAG ACCACCGCGC TGCTCTTCTT CTGCCTGCCC 660 

ATGGCCATCA TGAGCGTGCT CTACCTGCTC ATTGGGCTGC GACTGCGGCG GGAGAGGCTG 720 

CTGCTCATGC AGGAGGCCAA GGGCAGGGGC TCTGCAGCAG CCAGGTCCAG ATACACCTGC 780 

AGGCTCCAGC AGCACGATCG GGGCCGGAGA CAAGTGAAGA AGATGCTGTT TGTCCTGGTC 840 

GTGGTGTTTG GCATCTGCTG GGCCCCGTTC CACGCCGACC GCGTCATGTG GAGCGTCGTG 900 

TCACAGTGGA CAGATGGCCT GCACCTGGCC TTCCAGCACG TGCACGTCAT CTCCGGCATC 960

TTCTTCTACC TGGGCTCGGC GGCCAACCCC GTGCTCTATA GCCTCATGTC CAGCCGCTTC 1020 

CGAGAGACCT TCCAGGAGGC CCTGTGCCTC GGGGCCTGCT GCCATCGCCT CAGACCCCGC 1080 

CACAGCTCCC ACAGCCTCAG CAGGATGACC ACAGGCAGCA CCCTGTGTGA TGTGGGCTCC 1140 

CTGGGCAGCT GGGTCCACCC CCTGGCTGGG AACGATGGCC CAGAGGCGCA GCAAGAGACC 1200 

GATCCATCCT GA 1212 
(225) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 403 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Met Ala Cys Asn Gly Ser Ala Ala Arg Gly His Phe Asp Pro Glu Asp
1 5 10 15

Leu Asn Leu Thr Asp Glu Ala Leu Arg Leu Lys Tyr Leu Gly Pro Gln
20 25 30

Gln Thr Glu Leu Phe Met Pro Ile Cys Ala Thr Tyr Leu Leu Ile Phe
35 40 45

Val Val Gly Ala Val Gly Asn Gly Leu Thr Cys Leu Val Ile Leu Arg
50 55 60

His Lys Ala Met Arg Thr Pro Thr Asn Tyr Tyr Leu Phe Ser Leu Ala
65 70 75 80

Val Ser Asp Leu Leu Val Leu Leu Val Gly Leu Pro Leu Glu Leu Tyr
85 90 95

Glu Met Trp His Asn Tyr Pro Phe Leu Leu Gly Val Gly Gly Cys Tyr
100 105 110

Phe Arg Thr Leu Leu Phe Glu Met Val Cys Leu Ala Ser Val Leu Asn
115 120 125

Val Thr Ala Leu Ser Val Glu Arg Tyr Val Ala Val Val His Pro Leu
130 135 140

Gln Ala Arg Ser Met Val Thr Arg Ala His Val Arg Arg Val Leu Gly
145 150 155 160

Ala Val Trp Gly Leu Ala Met Leu Cys Ser Leu Pro Asn Thr Ser Leu
165 170 175

His Gly Ile Arg Gln Leu His Val Pro Cys Arg Gly Pro Val Pro Asp
180 185 190

Ser Ala Val Cys Met Leu Val Arg Pro Arg Ala Leu Tyr Asn Met Val
195 200 205

Val Gln Thr Thr Ala Leu Leu Phe Phe Cys Leu Pro Met Ala Ile Met
210 215 220

Ser Val Leu Tyr Leu Leu Ile Gly Leu Arg Leu Arg Arg Glu Arg Leu
225 230 235 240

Leu Leu Met Gln Glu Ala Lys Gly Arg Gly Ser Ala Ala Ala Arg Ser
245 250 255

Arg Tyr Thr Cys Arg Leu Gln Gln His Asp Arg Gly Arg Arg Gln Val
260 265 270

Lys Lys Met Leu Phe Val Leu Val Val Val Phe Gly Ile Cys Trp Ala
275 280 285

Pro Phe His Ala Asp Arg Val Met Trp Ser Val Val Ser Gln Trp Thr
290 295 300

Asp Gly Leu His Leu Ala Phe Gln His Val His Val Ile Ser Gly Ile
305 310 315 320

Phe Phe Tyr Leu Gly Ser Ala Ala Asn Pro Val Leu Tyr Ser Leu Met
325 330 335

Ser Ser Arg Phe Arg Glu Thr Phe Gln Glu Ala Leu Cys Leu Gly Ala
340 345 350

Cys Cys His Arg Leu Arg Pro Arg His Ser Ser His Ser Leu Ser Arg
355 360 365

Met Thr Thr Gly Ser Thr Leu Cys Asp Val Gly Ser Leu Gly Ser Trp
370 375 380

Val His Pro Leu Ala Gly Asn Asp Gly Pro Glu Ala Gln Gln Glu Thr
385 390 395 400

Asp Pro Ser



(226) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1098 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225:

ATGGGGAACA TCACTGCAGA CAACTCCTCG ATGAGCTGTA CCATCGACCA TACCATCCAC 60 

CAGACGCTGG CCCCGGTGGT CTATGTTACC GTGCTGGTGG TGGGCTTCCC GGCCAACTGC 120 

CTGTCCCTCT ACTTCGGCTA CCTGCAGATC AAGGCCCGGA ACGAGCTGGG CGTGTACCTG 180

TGCAACCTGA CGGTGGCCGA CCTCTTCTAC ATCTGCTCGC TGCCCTTCTG GCTGCAGTAC 240 

GTGCTGCAGC ACGACAACTG GTCTCACGGC GACCTGTCCT GCCAGGTGTG CGGCATCCTC 300 

CTGTACGAGA ACATCTACAT CAGCGTGGGC TTCCTCTGCT GCATCTCCGT GGACCGCTAC 360 

CTGGCTGTGG CCCATCCCTT CCGCTTCCAC CAGTTCCGGA CCCTGAAGGC GGCCGTCGGC 420

GTCAGCGTGG TCATCTGGGC CAAGGAGCTG CTGACCAGCA TCTACTTCCT GATGCACGAG 480

GAGGTCATCG AGGACGAGAA CCAGCACCGC GTGTGCTTTG AGCACTACCC CATCCAGGCA 540

TGGCAGCGCG CCATCAACTA CTACCGCTTC CTGGTGGGCT TCCTCTTCCC CATCTGCCTG 600 

CTGCTGGCGT CCTACCAGGG CATCCTGCGC GCCGTGCGCC GGAGCCACGG CACCCAGAAG 660 

AGCCGCAAGG ACCAGATCAA GCGGCTGGTG CTCAGCACCG TGGTCATCTT CCTGGCCTGC 720 

TTCCTGCCCT ACCACGTGTT GCTGCTGGTG CGCAGCGTCT GGGAGGCCAG CTGCGACTTC 780 

GCCAAGGGCG TTTTCAACGC CTACCACTTC TCCCTCCTGC TCACCAGCTT CAACTGCGTC 840 

GCCGACCCCG TGCTCTACTG CTTCGTCAGC GAGACCACCC ACCGGGACCT GGCCCGCCTC 900 

CGCGGGGCCT GCCTGGCCTT CCTCACCTGC TCCAGGACCG GCCGGGCCAG GGAGGCCTAC 960 

CCGCTGGGTG CCCCCGAGGC CTCCGGGAAA AGCGGGGCCC AGGGTGAGGA GCCCGAGCTG 1020 

TTGACCAAGC TCCACCCGGC CTTCCAGACC CCTAACTCGC CAGGGTCGGG CGGGTTCCCC 1080 

ACGGGCAGGT TGGCCTAG 1098 
(227) INFORMATION FOR SEQ ID NO:226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 365 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226:

Met Gly Asn Ile Thr Ala Asp Asn Ser Ser Met Ser Cys Thr Ile Asp
1 5 10 15

His Thr Ile His Gln Thr Leu Ala Pro Val Val Tyr Val Thr Val Leu
20 25 30

Val Val Gly Phe Pro Ala Asn Cys Leu Ser Leu Tyr Phe Gly Tyr Leu
35 40 45

Gln Ile Lys Ala Arg Asn Glu Leu Gly Val Tyr Leu Cys Asn Leu Thr
50 55 60

Val Ala Asp Leu Phe Tyr Ile Cys Ser Leu Pro Phe Trp Leu Gln Tyr
65 70 75 80

Val Leu Gln His Asp Asn Trp Ser His Gly Asp Leu Ser Cys Gln Val
85 90 95

Cys Gly Ile Leu Leu Tyr Glu Asn Ile Tyr Ile Ser Val Gly Phe Leu
100 105 110

Cys Cys Ile Ser Val Asp Arg Tyr Leu Ala Val Ala His Pro Phe Arg
115 120 125

Phe His Gln Phe Arg Thr Leu Lys Ala Ala Val Gly Val Ser Val Val
130 135 140

Ile Trp Ala Lys Glu Leu Leu Thr Ser Ile Tyr Phe Leu Met His Glu
145 150 155 160

Glu Val Ile Glu Asp Glu Asn Gln His Arg Val Cys Phe Glu His Tyr
165 170 175

Pro Ile Gln Ala Trp Gln Arg Ala Ile Asn Tyr Tyr Arg Phe Leu Val
180 185 190

Gly Phe Leu Phe Pro Ile Cys Leu Leu Leu Ala Ser Tyr Gln Gly Ile
195 200 205

Leu Arg Ala Val Arg Arg Ser His Gly Thr Gln Lys Ser Arg Lys Asp
210 215 220

Gln Ile Lys Arg Leu Val Leu Ser Thr Val Val Ile Phe Leu Ala Cys
225 230 235 240

Phe Leu Pro Tyr His Val Leu Leu Leu Val Arg Ser Val Trp Glu Ala
245 250 255

Ser Cys Asp Phe Ala Lys Gly Val Phe Asn Ala Tyr His Phe Ser Leu
260 265 270

Leu Leu Thr Ser Phe Asn Cys Val Ala Asp Pro Val Leu Tyr Cys Phe
275 280 285

Val Ser Glu Thr Thr His Arg Asp Leu Ala Arg Leu Arg Gly Ala Cys
290 295 300

Leu Ala Phe Leu Thr Cys Ser Arg Thr Gly Arg Ala Arg Glu Ala Tyr
305 310 315 320

Pro Leu Gly Ala Pro Glu Ala Ser Gly Lys Ser Gly Ala Gln Gly Glu
325 330 335

Glu Pro Glu Leu Leu Thr Lys Leu His Pro Ala Phe Gln Thr Pro Asn
340 345 350

Ser Pro Gly Ser Gly Gly Phe Pro Thr Gly Arg Leu Ala
355 360 365
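
As a cross-check on the listing, the opening of the nucleic acid sequence of SEQ ID NO:225 can be translated and compared with the opening residues of SEQ ID NO:226. The following is a minimal sketch only. It assumes the standard genetic code and assumes that SEQ ID NO:226 is the translation of SEQ ID NO:225 read from its first base, which the listing itself does not state. The names CODON_TABLE, THREE_LETTER and translate are illustrative and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (assumption: the standard code applies to these human sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, matching the listing's notation.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) until the first stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First 30 bases of SEQ ID NO:225 as printed in the listing.
start_225 = "ATGGGGAACA TCACTGCAGA CAACTCCTCG"
print(" ".join(translate(start_225)))
# prints: Met Gly Asn Ile Thr Ala Asp Asn Ser Ser
```

The printed residues agree with the first ten residues given for SEQ ID NO:226. The same function could be applied to a full sequence to locate positions where optical character recognition has altered a base or a residue name.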

(228) INFORMATION FOR SEQ ID NO:227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:
ATGGATATTC TTTGTGAAGA AAATACTTCT TTGAGCTCAA CTACGAACTC CCTAATGCAA 60
TTAAATGATG ACAACAGGCT CTACAGTAAT GACTTTAACT CCGGAGAAGC TAACACTTCT 120 
GATGCATTTA ACTGGACAGT CGACTCTGAA AATCGAACCA ACCTTTCCTG TGAAGGGTGC 180
CTCTCACCGT CGTGTCTCTC CTTACTTCAT CTCCAGGAAA AAAACTGGTC TGCTTTACTG 240 
ACAGCCGTAG TGATTATTCT AACTATTGCT GGAAACATAC TCGTCATCAT GGCAGTGTCC 300 
CTAGAGAAAA AGCTGCAGAA TGCCACCAAC TATTTCCTGA TGTCACTTGC CATAGCTGAT 360
ATGCTGCTGG GTTTCCTTGT CATGCCCGTG TCCATGTTAA CCATCCTGTA TGGGTACCGG 420 
TGGCCTCTGC CGAGCAAGCT TTGTGCAGTC TGGATTTACC TGGACGTGCT CTTCTCCACG 480 
GCCTCCATCA TGCACCTCTG CGCCATCTCG CTGGACCGCT ACGTCGCCAT CCAGAATCCC 540
ATCCACCACA GCCGCTTCAA CTCCAGAACT AAGGCATTTC TGAAAATCAT TGCTGTTTGG 600 
ACCATATCAG TAGGTATATC CATGCCAATA CCAGTCTTTG GGCTACAGGA CGATTCGAAG 660 
GTCTTTAAGG AGGGGAGTTG CTTACTCGCC GATGATAACT TTGTCCTGAT CGGCTCTTTT 720
GTGTCATTTT TCATTCCCTT AACCATCATG GTGATCACCT ACTTTCTAAC TATCAAGTCA 780 
CTCCAGAAAG AAGCTACTTT GTGTGTAAGT GATCTTGGCA CACGGGCCAA ATTAGCTTCT 840 
TTCAGCTTCC TCCCTCAGAG TTCTTTGTCT TCAGAAAAGC TCTTCCAGCG GTCGATCCAT 900 
AGGGAGCCAG GGTCCTACAC AGGCAGGAGG ACTATGCAGT CCATCAGCAA TGAGCAAAAG 960
GCAAAGAAGG TGCTGGGCAT CGTCTTCTTC CTGTTTGTGG TGATGTGGTG CCCTTTCTTC 1020
ATCACAAACA TCATGGCCGT CATCTGCAAA GAGTCCTGCA ATGAGGATGT CATTGGGGCC 1080 
CTGCTCAATG TGTTTGTTTG GATCGGTTAT CTCTCTTCAG CAGTCAACCC ACTAGTCTAC 1140 
ACACTGTTCA ACAAGACCTA TAGGTCAGCC TTTTCACGGT ATATTCAGTG TCAGTACAAG 1200
GAAAACAAAA AACCATTGCA GTTAATTTTA GTGAACACAA TACCGGCTTT GGCCTACAAG 1260 
TCTAGCCAAC TTCAAATGGG ACAAAAAAAG AATTCAAAGC AAGATGCCAA GACAACAGAT 1320
AATGACTGCT CAATGGTTGC TCTAGGAAAG CAGTATTCTC AAGAGGCTTC TAAAGACAAT 1380 
AGCGACGGAG TGAATGAAAA GGTGAGCTGT GTGTGA 1416
(229) INFORMATION FOR SEQ ID NO:228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228:

Met Asp Ile Leu Cys Glu Glu Asn Thr Ser Leu Ser Ser Thr Thr Asn
1 5 10 15

Ser Leu Met Gln Leu Asn Asp Asp Asn Arg Leu Tyr Ser Asn Asp Phe
20 25 30

Asn Ser Gly Glu Ala Asn Thr Ser Asp Ala Phe Asn Trp Thr Val Asp
35 40 45

Ser Glu Asn Arg Thr Asn Leu Ser Cys Glu Gly Cys Leu Ser Pro Ser
50 55 60

Cys Leu Ser Leu Leu His Leu Gln Glu Lys Asn Trp Ser Ala Leu Leu
65 70 75 80

Thr Ala Val Val Ile Ile Leu Thr Ile Ala Gly Asn Ile Leu Val Ile
85 90 95

Met Ala Val Ser Leu Glu Lys Lys Leu Gln Asn Ala Thr Asn Tyr Phe
100 105 110

Leu Met Ser Leu Ala Ile Ala Asp Met Leu Leu Gly Phe Leu Val Met
115 120 125

Pro Val Ser Met Leu Thr Ile Leu Tyr Gly Tyr Arg Trp Pro Leu Pro
130 135 140

Ser Lys Leu Cys Ala Val Trp Ile Tyr Leu Asp Val Leu Phe Ser Thr
145 150 155 160

Ala Ser Ile Met His Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala
165 170 175

Ile Gln Asn Pro Ile His His Ser Arg Phe Asn Ser Arg Thr Lys Ala
180 185 190

Phe Leu Lys Ile Ile Ala Val Trp Thr Ile Ser Val Gly Ile Ser Met
195 200 205

Pro Ile Pro Val Phe Gly Leu Gln Asp Asp Ser Lys Val Phe Lys Glu
210 215 220

Gly Ser Cys Leu Leu Ala Asp Asp Asn Phe Val Leu Ile Gly Ser Phe
225 230 235 240

Val Ser Phe Phe Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Phe Leu
245 250 255

Thr Ile Lys Ser Leu Gln Lys Glu Ala Thr Leu Cys Val Ser Asp Leu
260 265 270

Gly Thr Arg Ala Lys Leu Ala Ser Phe Ser Phe Leu Pro Gln Ser Ser
275 280 285

Leu Ser Ser Glu Lys Leu Phe Gln Arg Ser Ile His Arg Glu Pro Gly
290 295 300

Ser Tyr Thr Gly Arg Arg Thr Met Gln Ser Ile Ser Asn Glu Gln Lys
305 310 315 320

Ala Lys Lys Val Leu Gly Ile Val Phe Phe Leu Phe Val Val Met Trp
325 330 335

Cys Pro Phe Phe Ile Thr Asn Ile Met Ala Val Ile Cys Lys Glu Ser
340 345 350

Cys Asn Glu Asp Val Ile Gly Ala Leu Leu Asn Val Phe Val Trp Ile
355 360 365

Gly Tyr Leu Ser Ser Ala Val Asn Pro Leu Val Tyr Thr Leu Phe Asn
370 375 380

Lys Thr Tyr Arg Ser Ala Phe Ser Arg Tyr Ile Gln Cys Gln Tyr Lys
385 390 395 400

Glu Asn Lys Lys Pro Leu Gln Leu Ile Leu Val Asn Thr Ile Pro Ala
405 410 415

Leu Ala Tyr Lys Ser Ser Gln Leu Gln Met Gly Gln Lys Lys Asn Ser
420 425 430

Lys Gln Asp Ala Lys Thr Thr Asp Asn Asp Cys Ser Met Val Ala Leu
435 440 445

Gly Lys Gln Tyr Ser Glu Glu Ala Ser Lys Asp Asn Ser Asp Gly Val
450 455 460

Asn Glu Lys Val Ser Cys Val
465 470



(230) INFORMATION FOR SEQ ID NO:229: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1377 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:

ATGGTGAACC TGAGGAATGC GGTGCATTCA TTCCTTGTGC ACCTAATTGG CCTATTGGTT 60 

TGGCAATGTG ATATTTCTGT GAGCCCAGTA GCAGCTATAG TAACTGACAT TTTCAATACC 120 

TCCGATGGTG GACGCTTCAA ATTCCCAGAC GGGGTACAAA ACTGGCCAGC ACTTTCAATC 180 

GTCATCATAA TAATCATGAC AATAGGTGGC AACATCCTTG TGATCATGGC AGTAAGCATG 240 

GAAAAGAAAC TGCACAATGC CACCAATTAC TTCTTAATGT CCCTAGCCAT TGCTGATATG 300 

CTAGTGGGAC TACTTGTCAT GCCCCTGTCT CTCCTGGCAA TCCTTTATGA TTATGTCTGG 360 

CCACTACCTA GATATTTGTG CCCCGTCTGG ATTTCTTTAG ATGTTTTATT TTCAACAGCG 420 

TCCATCATGC ACCTCTGCGC TATATCGCTG GATCGGTATG TAGCAATACG TAATCCTATT 480 

GAGCATAGCC GTTTCAATTC GCGGACTAAG GCCATCATGA AGATTGCTAT TGTTTGGGCA 540 

ATTTCTATAG GTGTATCAGT TCCTATCCCT GTGATTGGAC TGAGGGACGA AGAAAAGGTG 600 

TTCGTGAACA ACACGACGTG CGTGCTCAAC GACCCAAATT TCGTTCTTAT TGGGTCCTTC 660 

GTAGCTTTCT TCATACCGCT GACGATTATG GTGATTACGT ATTGCCTGAC CATCTACGTT 720 

CTGCGCCGAC AAGCTTTGAT GTTACTGCAC GGCCACACCG AGGAACCGCC TGGACTAAGT 780 

CTGGATTTCC TGAAGTGCTG CAAGAGGAAT ACGGCCGAGG AAGAGAACTC TGCAAACCCT 840

AACCAAGACC AGAACGCACG CCGAAGAAAG AAGAAGGAGA GACGTCCTAG GGGCACCATG 900 

CAGGCTATCA ACAATGAAAG AAAAGCTAAG AAAGTCCTTG GGATTGTTTT CTTTGTGTTT 960 

CTGATCATGT GGTGCCCATT TTTCATTACC AATATTCTGT CTGTTCTTTG TGAGAAGTCC 1020

TGTAACCAAA AGCTCATGGA AAAGCTTCTG AATGTGTTTG TTTGGATTGG CTATGTTTGT 1080 

TCAGGAATCA ATCCTCTGGT GTATACTCTG TTCAACAAAA TTTACCGAAG GGCATTCTCC 1140 

AACTATTTGC GTTGCAATTA TAAGGTAGAG AAAAAGCCTC CTGTCAGGCA GATTCCAAGA 1200

GTTGCCGCCA CTGCTTTGTC TGGGAGGGAG CTTAATGTTA ACATTTATCG GCATACCAAT 1260 

GAACCGGTGA TCGAGAAAGC CAGTGACAAT GAGCCCGGTA TAGAGATGCA AGTTGAGAAT 1320 

TTAGAGTTAC CAGTAAATCC CTCCAGTGTG GTTAGCGAAA GGATTAGCAG TGTGTGA 1377 
(231) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 458 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 




(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

Met Val Asn Leu Arg Asn Ala Val His Ser Phe Leu Val His Leu Ile
1 5 10 15

Gly Leu Leu Val Trp Gln Cys Asp Ile Ser Val Ser Pro Val Ala Ala
20 25 30

Ile Val Thr Asp Ile Phe Asn Thr Ser Asp Gly Gly Arg Phe Lys Phe
35 40 45

Pro Asp Gly Val Gln Asn Trp Pro Ala Leu Ser Ile Val Ile Ile Ile
50 55 60

Ile Met Thr Ile Gly Gly Asn Ile Leu Val Ile Met Ala Val Ser Met
65 70 75 80

Glu Lys Lys Leu His Asn Ala Thr Asn Tyr Phe Leu Met Ser Leu Ala
85 90 95

Ile Ala Asp Met Leu Val Gly Leu Leu Val Met Pro Leu Ser Leu Leu
100 105 110

Ala Ile Leu Tyr Asp Tyr Val Trp Pro Leu Pro Arg Tyr Leu Cys Pro
115 120 125

Val Trp Ile Ser Leu Asp Val Leu Phe Ser Thr Ala Ser Ile Met His
130 135 140

Leu Cys Ala Ile Ser Leu Asp Arg Tyr Val Ala Ile Arg Asn Pro Ile
145 150 155 160

Glu His Ser Arg Phe Asn Ser Arg Thr Lys Ala Ile Met Lys Ile Ala
165 170 175

Ile Val Trp Ala Ile Ser Ile Gly Val Ser Val Pro Ile Pro Val Ile
180 185 190

Gly Leu Arg Asp Glu Glu Lys Val Phe Val Asn Asn Thr Thr Cys Val
195 200 205

Leu Asn Asp Pro Asn Phe Val Leu Ile Gly Ser Phe Val Ala Phe Phe
210 215 220

Ile Pro Leu Thr Ile Met Val Ile Thr Tyr Cys Leu Thr Ile Tyr Val
225 230 235 240

Leu Arg Arg Gln Ala Leu Met Leu Leu His Gly His Thr Glu Glu Pro
245 250 255

Pro Gly Leu Ser Leu Asp Phe Leu Lys Cys Cys Lys Arg Asn Thr Ala
260 265 270

Glu Glu Glu Asn Ser Ala Asn Pro Asn Gln Asp Gln Asn Ala Arg Arg
275 280 285

Arg Lys Lys Lys Glu Arg Arg Pro Arg Gly Thr Met Gln Ala Ile Asn
290 295 300

Asn Glu Arg Lys Ala Lys Lys Val Leu Gly Ile Val Phe Phe Val Phe
305 310 315 320

Leu Ile Met Trp Cys Pro Phe Phe Ile Thr Asn Ile Leu Ser Val Leu
325 330 335

Cys Glu Lys Ser Cys Asn Gln Lys Leu Met Glu Lys Leu Leu Asn Val
340 345 350

Phe Val Trp Ile Gly Tyr Val Cys Ser Gly Ile Asn Pro Leu Val Tyr
355 360 365

Thr Leu Phe Asn Lys Ile Tyr Arg Arg Ala Phe Ser Asn Tyr Leu Arg
370 375 380

Cys Asn Tyr Lys Val Glu Lys Lys Pro Pro Val Arg Gln Ile Pro Arg
385 390 395 400

Val Ala Ala Thr Ala Leu Ser Gly Arg Glu Leu Asn Val Asn Ile Tyr
405 410 415

Arg His Thr Asn Glu Pro Val Ile Glu Lys Ala Ser Asp Asn Glu Pro
420 425 430

Gly Ile Glu Met Gln Val Glu Asn Leu Glu Leu Pro Val Asn Pro Ser
435 440 445

Ser Val Val Ser Glu Arg Ile Ser Ser Val
450 455

(232) INFORMATION FOR SEQ ID NO: 231:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1068 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 

ATGGATCAGT TCCCTGAATC AGTGACAGAA AACTTTGAGT ACGATGATTT GGCTGAGGCC 60 

TGTTATATTG GGGACATCGT GGTCTTTGGG ACTGTGTTCC TGTCCATATT CTACTCCGTC 120 

ATCTTTGCCA TTGGCCTGGT GGGAAATTTG TTGGTAGTGT TTGCCCTCAC CAACAGCAAG 180

AAGCCCAAGA GTGTCACCGA CATTTACCTC CTGAACCTCG CCTTGTCTGA TCTGCTGTTT 240

GTAGCCACTT TGCCCTTCTG GACTCACTAT TTGATAAATC AAAAGGGCCT CCACAATGCC 300

ATGTGCAAAT TCACTACCGC CTTCTTCTTC ATCGGCTTTT TTGGAAGCAT ATTCTTCATC 360

ACCGTCATCA GCATTGATAG GTACCTGGCC ATCGTCCTCG CCGCCAACTC CATGAACAAC 420 

CGGACCGTGC AGCATGGCGT CACCATCAGC CTAGGCGTCT GGGCAGCAGC CATTTTGGTG 480

GCAGCACCCC AGTTCATGTT CACAAAGCAG AAAGAAAATG AATGCCTTGG TGACTACCCC 540

GAGGTCCTCC AGGAAATCTG GCCCGTGCTC CGCAATGTGG AAACAAATTT TCTTGGCTTC 600 

CTACTCCCCC TGCTCATTAT GAGTTATTGC TACTTCAGAA TCATCCAGAC GCTGTTTTCC 660 

TGCAAGAACC ACAAGAAAGC CAAAGCCAAG AAACTGATCC TTCTGGTGGT CATCGTGTTT 720 

TTCCTCTTCT GGACACCCTA CAACGTTATG ATTTTCCTGG AGACGCTTAA GCTCTATGAC 780

TTCTTTCCCA GTTGTGACAT GAGGAAGGAT CTGAGGCTGG CCCTCAGTGT GACTGAGACG 840 

GTTGCATTTA GCCATTGTTG CCTGAATCCT CTCATCTATG CATTTGCTGG GGAGAAGTTC 900 

AGAAGATACC TTTACCACCT GTATGGGAAA TGCCTGGCTG TCCTGTGTGG GCGCTCAGTC 960

CACGTTGATT TCTCCTCATC TGAATCACAA AGGAGCAGGC ATGGAAGTGT TCTGAGCAGC 1020 

AATTTTACTT ACCACACGAG TGATGGAGAT GCATTGCTCC TTCTCTGA 1068 
(233) INFORMATION FOR SEQ ID NO : 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECDLE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 2: 

Met Asp Gln Phe Pro Glu Ser Val Thr Glu Asn Phe Glu Tyr Asp Asp
1 5 10 15

Leu Ala Glu Ala Cys Tyr Ile Gly Asp Ile Val Val Phe Gly Thr Val
20 25 30

Phe Leu Ser Ile Phe Tyr Ser Val Ile Phe Ala Ile Gly Leu Val Gly
35 40 45

Asn Leu Leu Val Val Phe Ala Leu Thr Asn Ser Lys Lys Pro Lys Ser
50 55 60

Val Thr Asp Ile Tyr Leu Leu Asn Leu Ala Leu Ser Asp Leu Leu Phe
65 70 75 80

Val Ala Thr Leu Pro Phe Trp Thr His Tyr Leu Ile Asn Glu Lys Gly
85 90 95

Leu His Asn Ala Met Cys Lys Phe Thr Thr Ala Phe Phe Phe Ile Gly
100 105 110

Phe Phe Gly Ser Ile Phe Phe Ile Thr Val Ile Ser Ile Asp Arg Tyr
115 120 125

Leu Ala Ile Val Leu Ala Ala Asn Ser Met Asn Asn Arg Thr Val Gln
130 135 140

His Gly Val Thr Ile Ser Leu Gly Val Trp Ala Ala Ala Ile Leu Val
145 150 155 160

Ala Ala Pro Gln Phe Met Phe Thr Lys Gln Lys Glu Asn Glu Cys Leu
165 170 175

Gly Asp Tyr Pro Glu Val Leu Gln Glu Ile Trp Pro Val Leu Arg Asn
180 185 190

Val Glu Thr Asn Phe Leu Gly Phe Leu Leu Pro Leu Leu Ile Met Ser
195 200 205

Tyr Cys Tyr Phe Arg Ile Ile Gln Thr Leu Phe Ser Cys Lys Asn His
210 215 220

Lys Lys Ala Lys Ala Lys Lys Leu Ile Leu Leu Val Val Ile Val Phe
225 230 235 240

Phe Leu Phe Trp Thr Pro Tyr Asn Val Met Ile Phe Leu Glu Thr Leu
245 250 255

Lys Leu Tyr Asp Phe Phe Pro Ser Cys Asp Met Arg Lys Asp Leu Arg
260 265 270

Leu Ala Leu Ser Val Thr Glu Thr Val Ala Phe Ser His Cys Cys Leu
275 280 285

Asn Pro Leu Ile Tyr Ala Phe Ala Gly Glu Lys Phe Arg Arg Tyr Leu
290 295 300

Tyr His Leu Tyr Gly Lys Cys Leu Ala Val Leu Cys Gly Arg Ser Val
305 310 315 320

His Val Asp Phe Ser Ser Ser Glu Ser Gln Arg Ser Arg His Gly Ser
325 330 335

Val Leu Ser Ser Asn Phe Thr Tyr His Thr Ser Asp Gly Asp Ala Leu
340 345 350

Leu Leu Leu
355
(234) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:
GGCTTAAGAG CATCATCGTG GTGCTGGTG 29

(235) INFORMATION FOR SEQ ID NO:234: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234: 
GTCACCACCA GCACCACGAT GATGCTCTTA AGCC 34

(236) INFORMATION FOR SEQ ID NO:235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:235:

CAAAGAAAGT ACTGGGCATC GTCTTCTTCC T 31
(237) INFORMATION FOR SEQ ID NO:236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:236:
TGCTCTAGAT TCCAGATAGG TGAAAACTTG 30 

(238) INFORMATION FOR SEQ ID NO:237:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:237: 
CTAGGGGCAC CATGCAGGCT ATCAACAATG AAAGAAAAGC TAAGAAAGTC 50 

(239) INFORMATION FOR SEQ ID NO:238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:238: 
CAAGGACTTT CTTAGCTTTT CTTTCATTGT TGATAGCCTG CATGGTGCCC 50 

(240) INFORMATION FOR SEQ ID NO: 239: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:
CGGCGGCAGA AGGCGAAACG CATGATCCTC GCGGT 35 

(241) INFORMATION FOR SEQ ID NO: 240: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240: 
ACCGCGAGGA TCATGCGTTT CGCCTTCTGC CGCCG 
(242) INFORMATION FOR SEQ ID NO:241:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241: 
GAGACATATT ATCTGCCACG GAGG 

(243) INFORMATION FOR SEQ ID NO:242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 
TTGGCATAGA AACCGGACCC AAGG 

(244) INFORMATION FOR SEQ ID NO:243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243: 
TAAGAATTCC ATAAAAATTA TGGAATGG 
(245) INFORMATION FOR SEQ ID NO:244:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 
CCAGGATCCA GCTGAAGTCT TCCATCATTC 30 
(246) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1071 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:245:

ATGAATGGGG TCTCGGAGGG GACCAGAGGC TGCAGTGACA GGCAACCTGG GGTCCTGACA 60 

CGTGATCGCT CTTGTTCCAG GAAGATGAAC TCTTCCGGAT GCCTGTCTGA GGAGGTGGGG 120 

TCCCTCCGCC CACTGACTGT GGTTATCCTG TCTGCGTCCA TTGTCGTCGG AGTGCTGGGC 180 

AATGGGCTGG TGCTGTGGAT GACTGTCTTC CGTATGGCAC GCACGGTCTC CACCGTCTGC 240 

TTCTTCCACC TGGCCCTTGC CGATTTCATG CTCTCACTGT CTCTGCCCAT TGCCATGTAC 300

TATATTGTCT CCAGGCAGTG GCTCCTCGGA GAGTGGGCCT GCAAACTCTA CATCACCTTT 360 

GTGTTCCTCA GCTACTTTGC CAGTAACTGC CTCCTTGTCT TCATCTCTGT GGACCGTTGC 420 

ATCTCTGTCC TCTACCCCGT CTGGGCCCTG AACCACCGCA CTGTGCAGCG GGCGAGCTGG 480 

CTGGCCTTTG GGGTGTGGCT CCTGGCCGCC GCCTTGTGCT CTGCGCACCT GAAATTCCGG 540 

ACAACCAGAA AATGGAATGG CTGTACGCAC TGCTACTTGG CGTTCAACTC TGACAATGAG 600

ACTGCCCAGA TTTGGATTGA AGGGGTCGTG GAGGGACACA TTATAGGGAC CATTGGCCAC 660

TTCCTGCTGG GCTTCCTGGG GCCCTTAGCA ATCATAGGCA CCTGCGCCCA CCTCATCCGG 720 

GCCAAGCTCT TGCGGGAGGG CTGGGTCCAT GCCAACCGGC CCGCGAGGCT GCTGCTGGTG 780 

CTGGTGAGCG CTTTCTTTAT CTTCTGGTCC CCGTTTAACG TGGTGCTGTT GGTCCATCTG 840

TGGCGACGGG TGATGCTCAA GGAAATCTAC CACCCCCGGA TGCTGCTCAT CCTCCAGGCT 900

AGCTTTGCCT TGGGCTGTGT CAACAGCAGC CTCAACCCCT TCCTCTACGT CTTCGTTGGC 960 
AGAGATTTCC AAGAAAAGTT TTTCCAGTCT TTGACTTCTG CCCTGGCGAG GGCGTTTGGA 1020
GAGGAGGAGT TTCTGTCATC CTGTCCCCGT GGCAACGCCC CCCGGGAATG A 1071 
(247) INFORMATION FOR SEQ ID NO:246: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 356 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:246:

Met Asn Gly Val Ser Glu Gly Thr Arg Gly Cys Ser Asp Arg Gln Pro
1 5 10 15

Gly Val Leu Thr Arg Asp Arg Ser Cys Ser Arg Lys Met Asn Ser Ser
20 25 30

Gly Cys Leu Ser Glu Glu Val Gly Ser Leu Arg Pro Leu Thr Val Val
35 40 45

Ile Leu Ser Ala Ser Ile Val Val Gly Val Leu Gly Asn Gly Leu Val
50 55 60

Leu Trp Met Thr Val Phe Arg Met Ala Arg Thr Val Ser Thr Val Cys
65 70 75 80

Phe Phe His Leu Ala Leu Ala Asp Phe Met Leu Ser Leu Ser Leu Pro
85 90 95

Ile Ala Met Tyr Tyr Ile Val Ser Arg Gln Trp Leu Leu Gly Glu Trp
100 105 110

Ala Cys Lys Leu Tyr Ile Thr Phe Val Phe Leu Ser Tyr Phe Ala Ser
115 120 125

Asn Cys Leu Leu Val Phe Ile Ser Val Asp Arg Cys Ile Ser Val Leu
130 135 140

Tyr Pro Val Trp Ala Leu Asn His Arg Thr Val Gln Arg Ala Ser Trp
145 150 155 160

Leu Ala Phe Gly Val Trp Leu Leu Ala Ala Ala Leu Cys Ser Ala His
165 170 175

Leu Lys Phe Arg Thr Thr Arg Lys Trp Asn Gly Cys Thr His Cys Tyr
180 185 190

Leu Ala Phe Asn Ser Asp Asn Glu Thr Ala Gln Ile Trp Ile Glu Gly
195 200 205

Val Val Glu Gly His Ile Ile Gly Thr Ile Gly His Phe Leu Leu Gly
210 215 220

Phe Leu Gly Pro Leu Ala Ile Ile Gly Thr Cys Ala His Leu Ile Arg
225 230 235 240

Ala Lys Leu Leu Arg Glu Gly Trp Val His Ala Asn Arg Pro Ala Arg
245 250 255

Leu Leu Leu Val Leu Val Ser Ala Phe Phe Ile Phe Trp Ser Pro Phe
260 265 270

Asn Val Val Leu Leu Val His Leu Trp Arg Arg Val Met Leu Lys Glu
275 280 285

Ile Tyr His Pro Arg Met Leu Leu Ile Leu Gln Ala Ser Phe Ala Leu
290 295 300

Gly Cys Val Asn Ser Ser Leu Asn Pro Phe Leu Tyr Val Phe Val Gly
305 310 315 320

Arg Asp Phe Gln Glu Lys Phe Phe Gln Ser Leu Thr Ser Ala Leu Ala
325 330 335

Arg Ala Phe Gly Glu Glu Glu Phe Leu Ser Ser Cys Pro Arg Gly Asn
340 345 350

Ala Pro Arg Glu
355



(248) INFORMATION FOR SEQ ID NO:247:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 
GCAGAATTCG GCGGCCCCAT GGACCTGCCC CC 

(249) INFORMATION FOR SEQ ID NO:248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248: 
GCTGGATCCC CCGAGCAGTG GCGTTACTTC 30

(250) INFORMATION FOR SEQ ID NO:249:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 903 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

ATGGACCTGC CCCCGCAGCT CTCCTTCGGC CTCTATGTGG CCGCCTTTGC GCTGGGCTTC 60

CCGCTCAACG TCCTGGCCAT CCGAGGCGCG ACGGCCCACG CCCGGCTCCG TCTCACCCCT 120 

AGCCTGGTCT ACGCCCTGAA CCTGGGCTGC TCCGACCTGC TGCTGACAGT CTCTCTGCCC 180 

CTGAAGGCGG TGGAGGCGCT AGCCTCCGGG GCCTGGCCTC TGCCGGCCTC GCTGTGCCCC 240 

GTCTTCGCGG TGGCCCACTT CTTCCCACTC TATGCCGGCG GGGGCTTCCT GGCCGCCCTG 300 

AGTGCAGGCC GCTACCTGGG AGCAGCCTTC CCCTTGGGCT ACCAAGCCTT CCGGAGGCCG 360

TGCTATTCCT GGGGGGTGTG CGCGGCCATC TGGGCCCTCG TCCTGTGTCA CCTGGGTCTG 420 

GTCTTTGGGT TGGAGGCTCC AGGAGGCTGG CTGGACCACA GCAACACCTC CCTGGGCATC 480 

AACACACCGG TCAACGGCTC TCCGGTCTGC CTGGAGGCCT GGGACCCGGC CTCTGCCGGC 540 

CCGGCCCGCT TCAGCCTCTC TCTCCTGCTC TTTTTTCTGC CCTTGGCCAT CACAGCCTTC 600 

TGCTACGTGG GCTGCCTCCG GGCACTGGCC CGCTCCGGCC TGACGCACAG GCGGAAGCTG 660

CGGGCCGCCT GGGTGGCCGG CGGGGCCCTC CTCACGCTGC TGCTCTGCGT AGGACCCTAC 720 

AACGCCTCCA ACGTGGCCAG CTTCCTGTAC CCCAATCTAG GAGGCTCCTG GCGGAAGCTG 780 

GGGCTCATCA CGGGTGCCTG GAGTGTGGTG CTTAATCCGC TGGTGACCGG TTACTTGGGA 840 

AGGGGTCCTG GCCTGAAGAC AGTGTGTGCG GCAAGAACGC AAGGGGGCAA GTCCCAGAAG 900 

TAA 903

(251) INFORMATION FOR SEQ ID NO:250:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:250:

Met Asp Leu Pro Pro Gln Leu Ser Phe Gly Leu Tyr Val Ala Ala Phe
1 5 10 15

Ala Leu Gly Phe Pro Leu Asn Val Leu Ala Ile Arg Gly Ala Thr Ala
20 25 30

His Ala Arg Leu Arg Leu Thr Pro Ser Leu Val Tyr Ala Leu Asn Leu
35 40 45

Gly Cys Ser Asp Leu Leu Leu Thr Val Ser Leu Pro Leu Lys Ala Val
50 55 60

Glu Ala Leu Ala Ser Gly Ala Trp Pro Leu Pro Ala Ser Leu Cys Pro
65 70 75 80

Val Phe Ala Val Ala His Phe Phe Pro Leu Tyr Ala Gly Gly Gly Phe
85 90 95

Leu Ala Ala Leu Ser Ala Gly Arg Tyr Leu Gly Ala Ala Phe Pro Leu
100 105 110

Gly Tyr Gln Ala Phe Arg Arg Pro Cys Tyr Ser Trp Gly Val Cys Ala
115 120 125

Ala Ile Trp Ala Leu Val Leu Cys His Leu Gly Leu Val Phe Gly Leu
130 135 140

Glu Ala Pro Gly Gly Trp Leu Asp His Ser Asn Thr Ser Leu Gly Ile
145 150 155 160

Asn Thr Pro Val Asn Gly Ser Pro Val Cys Leu Glu Ala Trp Asp Pro
165 170 175

Ala Ser Ala Gly Pro Ala Arg Phe Ser Leu Ser Leu Leu Leu Phe Phe
180 185 190

Leu Pro Leu Ala Ile Thr Ala Phe Cys Tyr Val Gly Cys Leu Arg Ala
195 200 205

Leu Ala Arg Ser Gly Leu Thr His Arg Arg Lys Leu Arg Ala Ala Trp
210 215 220

Val Ala Gly Gly Ala Leu Leu Thr Leu Leu Leu Cys Val Gly Pro Tyr
225 230 235 240

Asn Ala Ser Asn Val Ala Ser Phe Leu Tyr Pro Asn Leu Gly Gly Ser
245 250 255

Trp Arg Lys Leu Gly Leu Ile Thr Gly Ala Trp Ser Val Val Leu Asn
260 265 270

Pro Leu Val Thr Gly Tyr Leu Gly Arg Gly Pro Gly Leu Lys Thr Val
275 280 285

Cys Ala Ala Arg Thr Gln Gly Gly Lys Ser Gln Lys
290 295 300

(252) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251: 

CTCT^GCTTA CTCTCTCTCA CCAGTGGCCA C 

(253) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:
CCCTCCTCCC CCGGAGGACC TAGC 
(254) INFORMATION FOR SEQ ID NO:253: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1041 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253: 

ATGGATACAG GCCCCGACCA GTCCTACTTC TCCGGCAATC ACTGGTTCGT CTTCTCGGTG 60

TACCTTCTCA CTTTCCTGGT GGGGCTCCCC CTCAACCTGC TGGCCCTGGT GGTCTTCGTG 120 

GGCAAGCTGC AGCGCCGCCC GGTGGCCGTG GACGTGCTCC TGCTCAACCT GACCGCCTCG 180

GACCTGCTCC TGCTGCTGTT CCTGCCTTTC CGCATGGTGG AGGCAGCCAA TGGCATGCAC 240

TGGCCCCTGC CCTTCATCCT CTGCCCACTC TCTGGATTCA TCTTCTTCAC CACCATCTAT 300 
CTCACCGCCC TCTTCCTGGC AGCTGTGAGC ATTGAACGCT TCCTGAGTGT GGCCCACCCA 360
CTGTGGTACA AGACCCGGCC GAGGCTGGGG CAGGCAGGTC TGGTGAGTGT GGCCTGCTGG 420 
CTGTTGGCCT CTGCTCACTG CAGCGTGGTC TACGTCATAG AATTCTCAGG GGACATCTCC 480 
CACAGCCAGG GCACCAATGG GACCTGCTAC CTGGAGTTCC GGAAGGACCA GCTAGCCATC 540 
CTCCTGCCCG TGCGGCTGGA GATGGCTGTG GTCCTCTTTG TGGTCCCGCT GATCATCACC 600

AGCTACTGCT ACAGCCGCCT GGTGTGGATC CTCGGCAGAG GGGGCAGCCA CCGCCGGCAG 660 
AGGAGGGTGG CGGGGCTGTT GGCGGCCACG CTGCTCAACT TCCTTGTCTG CTTTGGGCCC 720 

TACAACGTGT CCCATGTCGT GGGCTATATC TGCGGTGAAA GCCCGGCATG GAGGATCTAC 780 

GTGACGCTTC TCAGCACCCT GAACTCCTGT GTCGACCCCT TTGTCTACTA CTTCTCCTCC 840 

TCCGGGTTCC AAGCCGACTT TCATGAGCTG CTGAGGAGGT TGTGTGGGCT CTGGGGCCAG 900

TGGCAGCAGG AGAGCAGCAT GGAGCTGAAG GAGCAGAAGG GAGGGGAGGA GCAGAGAGCG 960 

GACCGACCAG CTGAAAGAAA GACCAGTGAA CACTCACAGG GCTGTGGAAC TGGTGGCCAG 1020 

GTGGCCTGTG CTGAAAGCTA G 1041 
(255) INFORMATION FOR SEQ ID NO:254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Met Asp Thr Gly Pro Asp Gln Ser Tyr Phe Ser Gly Asn His Trp Phe
1 5 10 15

Val Phe Ser Val Tyr Leu Leu Thr Phe Leu Val Gly Leu Pro Leu Asn
20 25 30

Leu Leu Ala Leu Val Val Phe Val Gly Lys Leu Gln Arg Arg Pro Val
35 40 45

Ala Val Asp Val Leu Leu Leu Asn Leu Thr Ala Ser Asp Leu Leu Leu
50 55 60

Leu Leu Phe Leu Pro Phe Arg Met Val Glu Ala Ala Asn Gly Met His
65 70 75 80

Trp Pro Leu Pro Phe Ile Leu Cys Pro Leu Ser Gly Phe Ile Phe Phe
85 90 95

Thr Thr Ile Tyr Leu Thr Ala Leu Phe Leu Ala Ala Val Ser Ile Glu
100 105 110

Arg Phe Leu Ser Val Ala His Pro Leu Trp Tyr Lys Thr Arg Pro Arg
115 120 125

Leu Gly Gln Ala Gly Leu Val Ser Val Ala Cys Trp Leu Leu Ala Ser
130 135 140

Ala His Cys Ser Val Val Tyr Val Ile Glu Phe Ser Gly Asp Ile Ser
145 150 155 160

His Ser Gln Gly Thr Asn Gly Thr Cys Tyr Leu Glu Phe Arg Lys Asp
165 170 175

Gln Leu Ala Ile Leu Leu Pro Val Arg Leu Glu Met Ala Val Val Leu
180 185 190

Phe Val Val Pro Leu Ile Ile Thr Ser Tyr Cys Tyr Ser Arg Leu Val
195 200 205

Trp Ile Leu Gly Arg Gly Gly Ser His Arg Arg Gln Arg Arg Val Ala
210 215 220

Gly Leu Leu Ala Ala Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro
225 230 235 240

Tyr Asn Val Ser His Val Val Gly Tyr Ile Cys Gly Glu Ser Pro Ala
245 250 255

Trp Arg Ile Tyr Val Thr Leu Leu Ser Thr Leu Asn Ser Cys Val Asp
260 265 270

Pro Phe Val Tyr Tyr Phe Ser Ser Ser Gly Phe Gln Ala Asp Phe His
275 280 285

Glu Leu Leu Arg Arg Leu Cys Gly Leu Trp Gly Gln Trp Gln Gln Glu
290 295 300

Ser Ser Met Glu Leu Lys Glu Gln Lys Gly Gly Glu Glu Gln Arg Ala
305 310 315 320

Asp Arg Pro Ala Glu Arg Lys Thr Ser Glu His Ser Gln Gly Cys Gly
325 330 335

Thr Gly Gly Gln Val Ala Cys Ala Glu Ser
340 345

(256) INFORMATION FOR SEQ ID NO:255: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:255: 
TTTAAGCTTC CCCTCCAGGA TGCTGCCGGA C 31 

(257) INFORMATION FOR SEQ ID NO:256: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:256: 
GGCGAATTCT GAAGGTCCAG GGAAACTGCT A 31 

(258) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 993 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:257:

ATGCTGCCGG ACTGGAAGAG CTCCTTGATC CTCATGGCTT ACATCATCAT CTTCCTCACT 60 

GGCCTCCCTG CCAACCTCCT GGCCCTGCGG GCCTTTGTGG GGCGGATCCG CCAGCCCCAG 120

CCTGCACCTG TGCACATCCT CCTGCTGAGC CTGACGCTGG CCGACCTCCT CCTGCTGCTG 180 

CTGCTGCCCT TCAAGATCAT CGAGGCTGCG TCGAACTTCC GCTGGTACCT GCCCAAGGTC 240 

GTCTGCGCCC TCACGAGTTT TGGCTTCTAC AGCAGCATCT ACTGCAGCAC GTGGCTCCTG 300

GCGGGCATCA GCATCGAGCG CTACCTGGGA GTGGCTTTCC CCGTGCAGTA CAAGCTCTCC 360 

CGCCGGCCTC TGTATGGAGT GATTGCAGCT CTGGTGGCCT GGGTTATGTC CTTTGGTCAC 420 

TGCACCATCG TGATCATCGT TCAATACTTG AACACGACTG AGCAGGTCAG AAGTGGCAAT 480 

GAAATTACCT GCTACGAGAA CTTCACCGAT AACCAGTTGG ACGTGGTGCT GCCCGTGCGG 540 

CTGGAGCTGT GCCTGGTGCT CTTCTTCATC CCCATGGCAG TCACCATCTT CTGCTACTGG 600

CGTTTTGTGT GGATCATGCT CTCCCAGCCC CTTGTGGGGG CCCAGAGGCG GCGCCGAGCC 660

GTGGGGCTGG CTGTGGTGAC GCTGCTCAAT TTCCTGGTGT GCTTCGGACC TTACAACGTG 720 
TCCCACCTGG TGGGGTATCA CCAGAGAAAA AGCCCCTGGT GGCGGTCAAT AGCCGTGGTG 780

TTCAGTTCAC TCAACGCCAG TCTGGACCCC CTGCTCTTCT ATTTCTCTTC TTCAGTGGTG 840 

CGCAGGGCAT TTGGGAGAGG GCTGCAGGTG CTGCGGAATC AGGGCTCCTC CCTGTTGGGA 900 

CGCAGAGGCA AAGACACAGC AGAGGGGACA AATGAGGACA GGGGTGTGGG TCAAGGAGAA 960 

GGGATGCCAA GTTCGGACTT CACTACAGAG TAG 993
(259) INFORMATION FOR SEQ ID NO:258:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 330 amino acids

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258: 

Met Leu Pro Asp Trp Lys Ser Ser Leu Ile Leu Met Ala Tyr Ile Ile
1 5 10 15

Ile Phe Leu Thr Gly Leu Pro Ala Asn Leu Leu Ala Leu Arg Ala Phe
20 25 30

Val Gly Arg Ile Arg Gln Pro Gln Pro Ala Pro Val His Ile Leu Leu
35 40 45

Leu Ser Leu Thr Leu Ala Asp Leu Leu Leu Leu Leu Leu Leu Pro Phe
50 55 60

Lys Ile Ile Glu Ala Ala Ser Asn Phe Arg Trp Tyr Leu Pro Lys Val
65 70 75 80

Val Cys Ala Leu Thr Ser Phe Gly Phe Tyr Ser Ser Ile Tyr Cys Ser
85 90 95

Thr Trp Leu Leu Ala Gly Ile Ser Ile Glu Arg Tyr Leu Gly Val Ala
100 105 110

Phe Pro Val Gln Tyr Lys Leu Ser Arg Arg Pro Leu Tyr Gly Val Ile
115 120 125

Ala Ala Leu Val Ala Trp Val Met Ser Phe Gly His Cys Thr Ile Val
130 135 140

Ile Ile Val Gln Tyr Leu Asn Thr Thr Glu Gln Val Arg Ser Gly Asn
145 150 155 160

Glu Ile Thr Cys Tyr Glu Asn Phe Thr Asp Asn Gln Leu Asp Val Val
165 170 175

Leu Pro Val Arg Leu Glu Leu Cys Leu Val Leu Phe Phe Ile Pro Met
180 185 190

Ala Val Thr Ile Phe Cys Tyr Trp Arg Phe Val Trp Ile Met Leu Ser
195 200 205

Gln Pro Leu Val Gly Ala Gln Arg Arg Arg Arg Ala Val Gly Leu Ala
210 215 220

Val Val Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro Tyr Asn Val
225 230 235 240

Ser His Leu Val Gly Tyr His Gln Arg Lys Ser Pro Trp Trp Arg Ser
245 250 255

Ile Ala Val Val Phe Ser Ser Leu Asn Ala Ser Leu Asp Pro Leu Leu
260 265 270

Phe Tyr Phe Ser Ser Ser Val Val Arg Arg Ala Phe Gly Arg Gly Leu
275 280 285

Gln Val Leu Arg Asn Gln Gly Ser Ser Leu Leu Gly Arg Arg Gly Lys
290 295 300

Asp Thr Ala Glu Gly Thr Asn Glu Asp Arg Gly Val Gly Gln Gly Glu
305 310 315 320

Gly Met Pro Ser Ser Asp Phe Thr Thr Glu
325 330

(260) INFORMATION FOR SEQ ID NO:259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 
CCCAAGCTTC GGGCACCATG GACACCTCCC 30 
(261) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:
ACAGGATCCA AATGCACAGC ACTGGTAAGC

(262) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

CTATAACTGG GTTACATGGT TTAAC

(263) INFORMATION FOR SEQ ID NO:262:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

TTTGAATTCA CATATTAATT AGAGACATGG 
(264) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2724 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

ATGGACACCT CCCGGCTCGG TGTGCTCCTG TCCTTGCCTG TGCTGCTGCA GCTGGCGACC 60 

GGGGGCAGCT CTCCCAGGTC TGGTGTGTTG CTGAGGGGCT GCCCCACACA CTGTCATTGC 120 

GAGCCCGACG GCAGGATGTT GCTCAGGGTG GACTGCTCCG ACCTGGGGCT CTCGGAGCTG 180

CCTTCCAACC TCAGCGTCTT CACCTCCTAC CTAGACCTCA GTATGAACAA CATCAGTCAG 240 

CTGCTCCCGA ATCCCCTGCC CAGTCTCCGC TTCCTGGAGG AGTTACGTCT TGCGGGAAAC 300 

GCTCTGACAT ACATTCCCAA GGGAGCATTC ACTGGCCTTT ACAGTCTTAA AGTTCTTATG 360 



CTGCAGAATA ATCAGCTAAG ACACGTACCC ACAGAAGCTC TGCAGAATTT GCGAAGCCTT 420
CAATCCCTGC GTCTGGATGC TAACCACATC AGCTATGTGC CCCCAAGCTG TTTCAGTGGC 480 
CTGCATTCCC TGAGGCACCT GTGGCTGGAT GACAATGCGT TAACAGAAAT CCCCGTCCAG 540 
GCTTTTAGAA GTTTATCGGC ATTGCAAGCC ATGACCTTGG CCCTGAACAA AATACACCAC 600 
ATACCAGACT ATGCCTTTGG TIAACCTCTCC AGCTTGGTAG TTCTACATCT CCATAACAAT 660
AGJVATCCACT CCCTGGGAAA GAAATGCTTT GATGGGCTCC ACAGCCTAGA GACTTTAGAT 720
TTAAATTACA ATAACCTTGA TGAATTCCCC ACTGCAATTA GGACACTCTC CAACCTTAAA 780 
GAACTAGGAT TTCATAGCAA CAATATCAGG TCGATACCTG AGAAAGCATT TGTAGGCAAC 840 
CCTTCTCTTA TTACTATACA TTTCTATGAC AATCCCATCC AATTTGTTGG GAGATCTGCT 900
TTTCAACATT TACCTGAACT AAGAACACTG ACTCTGAATG GTGCCTCACA AATAACTGAA 960
TTTCCTGATT TAACTGGAAC TGCAAACCTG GAGAGTCTGA CTTTAACTGG AGCACAGATC 1020 
TCATCTCTTC CTCAAACCGT CTGCAATCAG TTACCTAATC TCCAAGTGCT AGATCTGTCT 1080 
TACAACCTAT TAGAAGATTT ACCCAGTTTT TCAGTCTGCC AAAAGCTTCA GAAAATTGAC 1140 
CTAAGACATA ATGAAATCTA CGAAATTAAA GTTGACACTT TCCAGCAGTT GCTTAGCCTC 1200 
CGATCGCTGA ATTTGGCTTG GAACAAAATT GCTATTATTC ACCCCAATGC ATTTTCCACT 1260
TTGCCATCCC TAATAAAGCT GGACCTATCG TCCAACCTCC TGTCGTCTTT TCCTATAACT 1320 
GGGTTACATG GTTTAACTCA CTTAAAATTA ACAGGAAATC ATGCCTTACA GAGCTTGATA 1380 
TCATCTGAAA ACTTTCCAGA ACTCAAGGTT ATAGAAATGC CTTATGCTTA CCAGTGCTGT 1440 
GCATTTGGAG TGTGTGAGAA TGCCTATAAG ATTTCTAATC AATGGAATAA AGGTGACAAC 1500 
AGCAGTATGG ACGACCTTCA TAAGT^GAT GCTGGAATGT TTCAGGCTCA AGATGAACGT 1560
GACCTTGAAG ATTTCCTGCT TGACTTTGAG GAAGACCTGA AAGCCCTTCA TTCAGTGCAG 1620 
TGTTCACCTT CCCCAGGCCC CTTCAAACCC TGTGAACACC TGCTTGATGG CTGGCTGATC 1680 
AGAATTGGAG TGTGGACCAT AGCAGTTCTG GCACTTACTT GTAATGCTTT GGTGACTTCA 1740 
ACAGTTTTCA GATCCCCTCT GTACATTTGC CCCATTAAAC TGTTAATTGG GGTCATCGCA 1800 
GCAGTGAACA TGCTCACGGG AGTCTCCAGT GCCGTGCTGG CTGGTGTGGA TGCGTTCACT 1860
TTTGGCAGCT TTGCACGACA TGGTGCCTGG TGGGAGAATG GGGTTGGTTG GCATGTCATT 1920 
GGTTTTTTGT CCATTTTTGC TTCAGAATCA TCTGTTTTCC TGCTTACTCT GGCAGCCCTG 1980 
GAGCGTGGGT TCTCTGTGAA ATATTCTGCA AAATTTGAAA CGAAAGCTCC ATTTTCTAGC 2040 
CTGAAAGTAA TCATTTTGCT CTGTGCCCTG CTGGCCTTGA CCATGGCCGC AGTTCCCCTG 2100
CTGGGTGGCA GCAAGTATGG CGCCTCCCCT CTCTGCCTCC CTTTCCXnTT TGGGGAGCCC 2160
AGCACCATGG GCTACATGGT CGCTCTCATC TTGCTCAATT CCCTTTGCTT CCTCATGATG 2220 
ACCATTGCCT ACACCAAGCT CTACTGCAAT TTGGACAAGG GAGACCTGGA GAATATTTGG 2280 

GACTGCTCTA TGGTAAAACA CATTGCCCTG TTCCTCTTCA CCAACTGCAT CCTAAACTGC 2340
CCTGTGGCTT TCTTGTCCTT CTCCTCTTTA ATAAACCTTA CATTTATCAG TCCTGAAGTA 2400 
ATTAAGTTTA TCCTTCTGGT GGTAGTCCCA CTTCCTGCAT GTCTCAATCC CCTTCTCTAC 2460 
ATCTTGTTCA ATCCTCACTT TAAGGAGGAT CTGGTGAGCC TGAGAAAGCA AACCTACGTC 2520 
TGGACAAGAT CAAAACACCC AAGCTTGATG TCAATTAACT CTGATGATGT CGAAAAACAG 2580

TCCTGTGACT CAACTCAAGC CTTGGTAACC TTTACCAGCT CCAGCATCAC TTATGACCTG 2640
CCTCCCAGTT CCGTGCCATC ACCAGCTTAT CCAGTGACTG AGAGCTGCCA TCTTTCCTCT 2700
GTGGCATTTG TCCCATGTCT CTAA 2724

(265) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 907 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:

Met Asp Thr Ser Arg Leu Gly Val Leu Leu Ser Leu Pro Val Leu Leu
1 5 10 15

Gln Leu Ala Thr Gly Gly Ser Ser Pro Arg Ser Gly Val Leu Leu Arg
20 25 30

Gly Cys Pro Thr His Cys His Cys Glu Pro Asp Gly Arg Met Leu Leu
35 40 45

Arg Val Asp Cys Ser Asp Leu Gly Leu Ser Glu Leu Pro Ser Asn Leu
50 55 60

Ser Val Phe Thr Ser Tyr Leu Asp Leu Ser Met Asn Asn Ile Ser Gln
65 70 75 80

Leu Leu Pro Asn Pro Leu Pro Ser Leu Arg Phe Leu Glu Glu Leu Arg
85 90 95

Leu Ala Gly Asn Ala Leu Thr Tyr Ile Pro Lys Gly Ala Phe Thr Gly
100 105 110

Leu Tyr Ser Leu Lys Val Leu Met Leu Gln Asn Asn Gln Leu Arg His
115 120 125

Val Pro Thr Glu Ala Leu Gln Asn Leu Arg Ser Leu Gln Ser Leu Arg
130 135 140

Leu Asp Ala Asn His Ile Ser Tyr Val Pro Pro Ser Cys Phe Ser Gly
145 150 155 160

Leu His Ser Leu Arg His Leu Trp Leu Asp Asp Asn Ala Leu Thr Glu
165 170 175

Ile Pro Val Gln Ala Phe Arg Ser Leu Ser Ala Leu Gln Ala Met Thr
180 185 190

Leu Ala Leu Asn Lys Ile His His Ile Pro Asp Tyr Ala Phe Gly Asn
195 200 205

Leu Ser Ser Leu Val Val Leu His Leu His Asn Asn Arg Ile His Ser
210 215 220

Leu Gly Lys Lys Cys Phe Asp Gly Leu His Ser Leu Glu Thr Leu Asp
225 230 235 240

Leu Asn Tyr Asn Asn Leu Asp Glu Phe Pro Thr Ala Ile Arg Thr Leu
245 250 255

Ser Asn Leu Lys Glu Leu Gly Phe His Ser Asn Asn Ile Arg Ser Ile
260 265 270

Pro Glu Lys Ala Phe Val Gly Asn Pro Ser Leu Ile Thr Ile His Phe
275 280 285

Tyr Asp Asn Pro Ile Gln Phe Val Gly Arg Ser Ala Phe Gln His Leu
290 295 300

Pro Glu Leu Arg Thr Leu Thr Leu Asn Gly Ala Ser Gln Ile Thr Glu
305 310 315 320

Phe Pro Asp Leu Thr Gly Thr Ala Asn Leu Glu Ser Leu Thr Leu Thr
325 330 335

Gly Ala Gln Ile Ser Ser Leu Pro Gln Thr Val Cys Asn Gln Leu Pro
340 345 350

Asn Leu Gln Val Leu Asp Leu Ser Tyr Asn Leu Leu Glu Asp Leu Pro
355 360 365

Ser Phe Ser Val Cys Gln Lys Leu Gln Lys Ile Asp Leu Arg His Asn
370 375 380

Glu Ile Tyr Glu Ile Lys Val Asp Thr Phe Gln Gln Leu Leu Ser Leu
385 390 395 400
Arg Ser Leu Asn Leu Ala Trp Asn Lys Ile Ala Ile Ile His Pro Asn
405 410 415

Ala Phe Ser Thr Leu Pro Ser Leu Ile Lys Leu Asp Leu Ser Ser Asn
420 425 430

Leu Leu Ser Ser Phe Pro Ile Thr Gly Leu His Gly Leu Thr His Leu
435 440 445

Lys Leu Thr Gly Asn His Ala Leu Gln Ser Leu Ile Ser Ser Glu Asn
450 455 460

Phe Pro Glu Leu Lys Val Ile Glu Met Pro Tyr Ala Tyr Gln Cys Cys
465 470 475 480

Ala Phe Gly Val Cys Glu Asn Ala Tyr Lys Ile Ser Asn Gln Trp Asn
485 490 495

Lys Gly Asp Asn Ser Ser Met Asp Asp Leu His Lys Lys Asp Ala Gly
500 505 510

Met Phe Gln Ala Gln Asp Glu Arg Asp Leu Glu Asp Phe Leu Leu Asp
515 520 525

Phe Glu Glu Asp Leu Lys Ala Leu His Ser Val Gln Cys Ser Pro Ser
530 535 540

Pro Gly Pro Phe Lys Pro Cys Glu His Leu Leu Asp Gly Trp Leu Ile
545 550 555 560

Arg Ile Gly Val Trp Thr Ile Ala Val Leu Ala Leu Thr Cys Asn Ala
565 570 575

Leu Val Thr Ser Thr Val Phe Arg Ser Pro Leu Tyr Ile Ser Pro Ile
580 585 590

Lys Leu Leu Ile Gly Val Ile Ala Ala Val Asn Met Leu Thr Gly Val
595 600 605

Ser Ser Ala Val Leu Ala Gly Val Asp Ala Phe Thr Phe Gly Ser Phe
610 615 620

Ala Arg His Gly Ala Trp Trp Glu Asn Gly Val Gly Cys His Val Ile
625 630 635 640

Gly Phe Leu Ser Ile Phe Ala Ser Glu Ser Ser Val Phe Leu Leu Thr
645 650 655

Leu Ala Ala Leu Glu Arg Gly Phe Ser Val Lys Tyr Ser Ala Lys Phe
660 665 670

Glu Thr Lys Ala Pro Phe Ser Ser Leu Lys Val Ile Ile Leu Leu Cys
675 680 685

Ala Leu Leu Ala Leu Thr Met Ala Ala Val Pro Leu Leu Gly Gly Ser
690 695 700

Lys Tyr Gly Ala Ser Pro Leu Cys Leu Pro Leu Pro Phe Gly Glu Pro
705 710 715 720

Ser Thr Met Gly Tyr Met Val Ala Leu Ile Leu Leu Asn Ser Leu Cys
725 730 735

Phe Leu Met Met Thr Ile Ala Tyr Thr Lys Leu Tyr Cys Asn Leu Asp
740 745 750

Lys Gly Asp Leu Glu Asn Ile Trp Asp Cys Ser Met Val Lys His Ile
755 760 765

Ala Leu Leu Leu Phe Thr Asn Cys Ile Leu Asn Cys Pro Val Ala Phe
770 775 780

Leu Ser Phe Ser Ser Leu Ile Asn Leu Thr Phe Ile Ser Pro Glu Val
785 790 795 800

Ile Lys Phe Ile Leu Leu Val Val Val Pro Leu Pro Ala Cys Leu Asn
805 810 815

Pro Leu Leu Tyr Ile Leu Phe Asn Pro His Phe Lys Glu Asp Leu Val
820 825 830

Ser Leu Arg Lys Gln Thr Tyr Val Trp Thr Arg Ser Lys His Pro Ser
835 840 845

Leu Met Ser Ile Asn Ser Asp Asp Val Glu Lys Gln Ser Cys Asp Ser
850 855 860

Thr Gln Ala Leu Val Thr Phe Thr Ser Ser Ser Ile Thr Tyr Asp Leu
865 870 875 880

Pro Pro Ser Ser Val Pro Ser Pro Ala Tyr Pro Val Thr Glu Ser Cys
885 890 895

His Leu Ser Ser Val Ala Phe Val Pro Cys Leu
900 905

(266) INFORMATION FOR SEQ ID NO: 265: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265: 

CGGAAGCTGC GGGCCAAATG GGTGGCCGGC 
(267) INFORMATION FOR SEQ ID NO:266:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 

CAGAGGAGGG TGAAGGGGCT GTTGGCG 

(268) INFORMATION FOR SEQ ID NO: 267: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267: 
GGCGGCGCCG AGCCAAGGGG CTGGCTGTGG 

(269) INFORMATION FOR SEQ ID NO:268: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 
GGGACTGCTC TATGAAAAAA CACATTGCCC TG 

(270) INFORMATION FOR SEQ ID NO:269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:

ATGAATGGGG TCTCGGAGGG GACCAGAGGC TGCAGTGACA GGCAACCTGG GGTCCTGACA 60

CGTGATCGCT CTTGTTCCAG GAAGATGAAC TCTTCCGGAT GCCTGTCTGA GGAGGTGGGG 120

TCCCTCCGCC CACTGACTGT GGTTATCCTG TCTGCGTCCA TTGTCGTCGG AGTGCTGGGC 180 

AATGGGCTGG TGCTGTGGAT GACTGTCTTC CGTATGGCAC GCACGGTCTC CACCGTCTGC 240 

TTCTTCCACC TGGCCCTTGC CGATTTCATG CTCTCACTGT CTCTGCCCAT TGCCATGTAC 300 

5 TATATTGTCT CCAGGCAGTG GCTCCTCGGA GAGTGGGCCT GCAAACTCTA CATCACCTTT 360 

GTGTTCCTCA GCTACTTTGC CAGTAACTGC CTCCTTGTCT TCATCTCTGT GGACCGTTGC 420 

ATCTCTGTCC TCTACCCCGT CTGGGCCCTG AACCACCGCA CTGTGCAGCG GGCGAGCTGG 480 

CTGGCCTTTG GGGTGTGGCT CCTGGCCGCC GCCTTGTGCT CTGCGCACCT GAAATTCCGG 540 

ACAACCAGAA AATGGAATGG CTGTACGCAC TGCTACTTGG CGTTCAACTC TGACAATGAG 600 

10 ACTGCCCAGA TTTGGATTGA AGGGGTCGTG GAGGGACACA TTATAGGGAC CATTGGCCAC 660 

TTCCTGCTGG GCTTCCTGGG GCCCTTAGCA ATCATAGGCA CCTGCGCCCA CCTCATCCGG 720 

GCCAAGCTCT TGCGGGAGGG CTGGGTCCAT GCCAACCGGC CCAAGAGGCT GCTGCTGGTG 780 

CTGGTGAGCG CTTTCTTTAT CTTCTGGTCC CCGTTTAACG TGGTGCTGTT GGTCCATCTG 840 

TGGCGACGGG TGATGCTCAA GGAAATCTAC CACCCCCGGA TGCTGCTCAT CCTCCAGGCT 900 

15 AGCTTTGCCT TGGGCTGTGT CAACAGCAGC CTCAACCCCT TCCTCTACGT CTTCGTTGGC 960 

AGAGATTTCC AAGAAAAGTT TTTCCAGTCT TTGACTTCTG CCCTGGCGAG GGCGTTTGGA 1020 

GAGGAGGAGT TTCTGTCATC CTGTCCCCGT GGCAACGCCC CCCGGGAATG A 1071 
(271) INFORMATION FOR SEQ ID NO:270: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 356 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:

Met Asn Gly Val Ser Glu Gly Thr Arg Gly Cys Ser Asp Arg Gln Pro
1 5 10 15

Gly Val Leu Thr Arg Asp Arg Ser Cys Ser Arg Lys Met Asn Ser Ser
20 25 30

Gly Cys Leu Ser Glu Glu Val Gly Ser Leu Arg Pro Leu Thr Val Val
35 40 45

Ile Leu Ser Ala Ser Ile Val Val Gly Val Leu Gly Asn Gly Leu Val
50 55 60

Leu Trp Met Thr Val Phe Arg Met Ala Arg Thr Val Ser Thr Val Cys
65 70 75 80

Phe Phe His Leu Ala Leu Ala Asp Phe Met Leu Ser Leu Ser Leu Pro
85 90 95

Ile Ala Met Tyr Tyr Ile Val Ser Arg Gln Trp Leu Leu Gly Glu Trp
100 105 110

Ala Cys Lys Leu Tyr Ile Thr Phe Val Phe Leu Ser Tyr Phe Ala Ser
115 120 125

Asn Cys Leu Leu Val Phe Ile Ser Val Asp Arg Cys Ile Ser Val Leu
130 135 140

Tyr Pro Val Trp Ala Leu Asn His Arg Thr Val Gln Arg Ala Ser Trp
145 150 155 160

Leu Ala Phe Gly Val Trp Leu Leu Ala Ala Ala Leu Cys Ser Ala His
165 170 175

Leu Lys Phe Arg Thr Thr Arg Lys Trp Asn Gly Cys Thr His Cys Tyr
180 185 190

Leu Ala Phe Asn Ser Asp Asn Glu Thr Ala Gln Ile Trp Ile Glu Gly
195 200 205

Val Val Glu Gly His Ile Ile Gly Thr Ile Gly His Phe Leu Leu Gly
210 215 220

Phe Leu Gly Pro Leu Ala Ile Ile Gly Thr Cys Ala His Leu Ile Arg
225 230 235 240

Ala Lys Leu Leu Arg Glu Gly Trp Val His Ala Asn Arg Pro Lys Arg
245 250 255

Leu Leu Leu Val Leu Val Ser Ala Phe Phe Ile Phe Trp Ser Pro Phe
260 265 270

Asn Val Val Leu Leu Val His Leu Trp Arg Arg Val Met Leu Lys Glu
275 280 285

Ile Tyr His Pro Arg Met Leu Leu Ile Leu Gln Ala Ser Phe Ala Leu
290 295 300

Gly Cys Val Asn Ser Ser Leu Asn Pro Phe Leu Tyr Val Phe Val Gly
305 310 315 320

Arg Asp Phe Gln Glu Lys Phe Phe Gln Ser Leu Thr Ser Ala Leu Ala
325 330 335

Arg Ala Phe Gly Glu Glu Glu Phe Leu Ser Ser Cys Pro Arg Gly Asn
340 345 350

Ala Pro Arg Glu
355

(272) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 903 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:

ATGGACCTGC CCCCGCAGCT CTCCTTCGGC CTCTATGTGG CCGCCTTTGC GCTGGGCTTC 60

CCGCTCAACG TCCTGGCCAT CCGAGGCGCG ACGGCCCACG CCCGGCTCCG TCTCACCCCT 120

AGCCTGGTCT ACGCCCTGAA CCTGGGCTGC TCCGACCTGC TGCTGACAGT CTCTCTGCCC 180

CTGAAGGCGG TGGAGGCGCT AGCCTCCGGG GCCTGGCCTC TGCCGGCCTC GCTGTGCCCC 240

GTCTTCGCGG TGGCCCACTT CTTCCCACTC TATGCCGGCG GGGGCTTCCT GGCCGCCCTG 300

AGTGCAGGCC GCTACCTGGG AGCAGCCTTC CCCTTGGGCT ACCAAGCCTT CCGGAGGCCG 360

TGCTATTCCT GGGGGGTGTG CGCGGCCATC TGGGCCCTCG TCCTGTGTCA CCTGGGTCTG 420

GTCTTTGGGT TGGAGGCTCC AGGAGGCTGG CTGGACCACA GCAACACCTC CCTGGGCATC 480

AACACACCGG TCAACGGCTC TCCGGTCTGC CTGGAGGCCT GGGACCCGGC CTCTGCCGGC 540

CCGGCCCGCT TCAGCCTCTC TCTCCTGCTC TTTTTTCTGC CCTTGGCCAT CACAGCCTTC 600

TGCTACGTGG GCTGCCTCCG GGCACTGGCC CGCTCCGGCC TGACGCACAG GCGGAAGCTG 660

CGGGCCAAAT GGGTGGCCGG CGGGGCCCTC CTCACGCTGC TGCTCTGCGT AGGACCCTAC 720

AACGCCTCCA ACGTGGCCAG CTTCCTGTAC CCCAATCTAG GAGGCTCCTG GCGGAAGCTG 780

GGGCTCATCA CGGGTGCCTG GAGTGTGGTG CTTAATCCGC TGGTGACCGG TTACTTGGGA 840

AGGGGTCCTG GCCTGAAGAC AGTGTGTGCG GCAAGAACGC AAGGGGGCAA GTCCCAGAAG 900

TAA 903

(273) INFORMATION FOR SEQ ID NO:272: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272: 

Met Asp Leu Pro Pro Gln Leu Ser Phe Gly Leu Tyr Val Ala Ala Phe
1 5 10 15

Ala Leu Gly Phe Pro Leu Asn Val Leu Ala Ile Arg Gly Ala Thr Ala
20 25 30

His Ala Arg Leu Arg Leu Thr Pro Ser Leu Val Tyr Ala Leu Asn Leu
35 40 45

Gly Cys Ser Asp Leu Leu Leu Thr Val Ser Leu Pro Leu Lys Ala Val
50 55 60

Glu Ala Leu Ala Ser Gly Ala Trp Pro Leu Pro Ala Ser Leu Cys Pro
65 70 75 80

Val Phe Ala Val Ala His Phe Phe Pro Leu Tyr Ala Gly Gly Gly Phe
85 90 95

Leu Ala Ala Leu Ser Ala Gly Arg Tyr Leu Gly Ala Ala Phe Pro Leu
100 105 110

Gly Tyr Gln Ala Phe Arg Arg Pro Cys Tyr Ser Trp Gly Val Cys Ala
115 120 125

Ala Ile Trp Ala Leu Val Leu Cys His Leu Gly Leu Val Phe Gly Leu
130 135 140

Glu Ala Pro Gly Gly Trp Leu Asp His Ser Asn Thr Ser Leu Gly Ile
145 150 155 160

Asn Thr Pro Val Asn Gly Ser Pro Val Cys Leu Glu Ala Trp Asp Pro
165 170 175

Ala Ser Ala Gly Pro Ala Arg Phe Ser Leu Ser Leu Leu Leu Phe Phe
180 185 190

Leu Pro Leu Ala Ile Thr Ala Phe Cys Tyr Val Gly Cys Leu Arg Ala
195 200 205

Leu Ala Arg Ser Gly Leu Thr His Arg Arg Lys Leu Arg Ala Lys Trp
210 215 220

Val Ala Gly Gly Ala Leu Leu Thr Leu Leu Leu Cys Val Gly Pro Tyr
225 230 235 240

Asn Ala Ser Asn Val Ala Ser Phe Leu Tyr Pro Asn Leu Gly Gly Ser
245 250 255

Trp Arg Lys Leu Gly Leu Ile Thr Gly Ala Trp Ser Val Val Leu Asn
260 265 270

Pro Leu Val Thr Gly Tyr Leu Gly Arg Gly Pro Gly Leu Lys Thr Val
275 280 285

Cys Ala Ala Arg Thr Gln Gly Gly Lys Ser Gln Lys
290 295 300

(274) INFORMATION FOR SEQ ID NO: 273:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:

ATGGATACAG GCCCCGACCA GTCCTACTTC TCCGGCAATC ACTGGTTCGT CTTCTCGGTG 60

TACCTTCTCA CTTTCCTGGT GGGGCTCCCC CTCAACCTGC TGGCCCTGGT GGTCTTCGTG 120

GGCAAGCTGC AGCGCCGCCC GGTGGCCGTG GACGTGCTCC TGCTCAACCT GACCGCCTCG 180

GACCTGCTCC TGCTGCTGTT CCTGCCTTTC CGCATGGTGG AGGCAGCCAA TGGCATGCAC 240

TGGCCCCTGC CCTTCATCCT CTGCCCACTC TCTGGATTCA TCTTCTTCAC CACCATCTAT 300

CTCACCGCCC TCTTCCTGGC AGCTGTGAGC ATTGAACGCT TCCTGAGTGT [illegible] 360

CTGTGGTACA AGACCCGGCC GAGGCTGGGG CAGGCAGGTC TGGTGAGTGT GGCCTGCTGG 420

CTGTTGGCCT CTGCTCACTG CAGCGTGGTC TACGTCATAG AATTCTCAGG GGACATCTCC 480

CACAGCCAGG GCACCAATGG GACCTGCTAC CTGGAGTTCC GGAAGGACCA GCTAGCCATC 540

CTCCTGCCCG TGCGGCTGGA GATGGCTGTG GTCCTCTTTG TGGTCCCGCT GATCATCACC 600

AGCTACTGCT ACAGCCGCCT GGTGTGGATC CTCGGCAGAG GGGGCAGCCA CCGCCGGCAG 660

AGGAGGGTGA AGGGGCTGTT GGCGGCCACG CTGCTCAACT TCCTTGTCTG CTTTGGGCCC 720

TACAACGTGT CCCATGTCGT GGGCTATATC TGCGGTGAAA GCCCGGCATG GAGGATCTAC 780

GTGACGCTTC TCAGCACCCT GAACTCCTGT GTCGACCCCT TTGTCTACTA CTTCTCCTCC 840

TCCGGGTTCC AAGCCGACTT TCATGAGCTG CTGAGGAGGT TGTGTGGGCT CTGGGGCCAG 900

TGGCAGCAGG AGAGCAGCAT GGAGCTGAAG GAGCAGAAGG GAGGGGAGGA GCAGAGAGCG 960

GACCGACCAG CTGAAAGAAA GACCAGTGAA CACTCACAGG GCTGTGGAAC TGGTGGCCAG 1020

GTGGCCTGTG CTGAAAGCTA G 1041

(275) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 346 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274: 

Met Asp Thr Gly Pro Asp Gln Ser Tyr Phe Ser Gly Asn His Trp Phe
1 5 10 15

Val Phe Ser Val Tyr Leu Leu Thr Phe Leu Val Gly Leu Pro Leu Asn
20 25 30

Leu Leu Ala Leu Val Val Phe Val Gly Lys Leu Gln Arg Arg Pro Val
35 40 45

Ala Val Asp Val Leu Leu Leu Asn Leu Thr Ala Ser Asp Leu Leu Leu
50 55 60

Leu Leu Phe Leu Pro Phe Arg Met Val Glu Ala Ala Asn Gly Met His
65 70 75 80

Trp Pro Leu Pro Phe Ile Leu Cys Pro Leu Ser Gly Phe Ile Phe Phe
85 90 95

Thr Thr Ile Tyr Leu Thr Ala Leu Phe Leu Ala Ala Val Ser Ile Glu
100 105 110

Arg Phe Leu Ser Val Ala His Pro Leu Trp Tyr Lys Thr Arg Pro Arg
115 120 125

Leu Gly Gln Ala Gly Leu Val Ser Val Ala Cys Trp Leu Leu Ala Ser
130 135 140

Ala His Cys Ser Val Val Tyr Val Ile Glu Phe Ser Gly Asp Ile Ser
145 150 155 160

His Ser Gln Gly Thr Asn Gly Thr Cys Tyr Leu Glu Phe Arg Lys Asp
165 170 175

Gln Leu Ala Ile Leu Leu Pro Val Arg Leu Glu Met Ala Val Val Leu
180 185 190

Phe Val Val Pro Leu Ile Ile Thr Ser Tyr Cys Tyr Ser Arg Leu Val
195 200 205

Trp Ile Leu Gly Arg Gly Gly Ser His Arg Arg Gln Arg Arg Val Lys
210 215 220

Gly Leu Leu Ala Ala Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro
225 230 235 240

Tyr Asn Val Ser His Val Val Gly Tyr Ile Cys Gly Glu Ser Pro Ala
245 250 255

Trp Arg Ile Tyr Val Thr Leu Leu Ser Thr Leu Asn Ser Cys Val Asp
260 265 270

Pro Phe Val Tyr Tyr Phe Ser Ser Ser Gly Phe Gln Ala Asp Phe His
275 280 285

Glu Leu Leu Arg Arg Leu Cys Gly Leu Trp Gly Gln Trp Gln Gln Glu
290 295 300

Ser Ser Met Glu Leu Lys Glu Gln Lys Gly Gly Glu Glu Gln Arg Ala
305 310 315 320

Asp Arg Pro Ala Glu Arg Lys Thr Ser Glu His Ser Gln Gly Cys Gly
325 330 335

Thr Gly Gly Gln Val Ala Cys Ala Glu Ser
340 345

(276) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 993 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:275: 

ATGCTGCCGG ACTGGAAGAG CTCCTTGATC CTCATGGCTT ACATCATCAT CTTCCTCACT 60 

GGCCTCCCTG CCAACCTCCT GGCCCTGCGG GCCTTTGTGG GGCGGATCCG CCAGCCCCAG 120

CCTGCACCTG TGCACATCCT CCTGCTGAGC CTGACGCTGG CCGACCTCCT CCTGCTGCTG 180

CTGCTGCCCT TCAAGATCAT CGAGGCTGCG TCGAACTTCC GCTGGTACCT GCCCAAGGTC 240

GTCTGCGCCC TCACGAGTTT TGGCTTCTAC AGCAGCATCT ACTGCAGCAC GTGGCTCCTG 300

GCGGGCATCA GCATCGAGCG CTACCTGGGA GTGGCTTTCC CCGTGCAGTA CAAGCTCTCC 360

CGCCGGCCTC TGTATGGAGT GATTGCAGCT CTGGTGGCCT GGGTTATGTC CTTTGGTCAC 420

TGCACCATCG TGATCATCGT TCAATACTTG AACACGACTG AGCAGGTCAG AAGTGGCAAT 480

GAAATTACCT GCTACGAGAA CTTCACCGAT AACCAGTTGG ACGTGGTGCT GCCCGTGCGG 540

CTGGAGCTGT GCCTGGTGCT CTTCTTCATC CCCATGGCAG TCACCATCTT CTGCTACTGG 600

CGTTTTGTGT GGATCATGCT CTCCCAGCCC CTTCTGGGGG CCCAGAGGCG GCGCCGAGCC 660

AAGGGGCTGG CTGTGGTGAC GCTGCTCAAT TTCCTGGTGT GCTTCGGACC TTACAACGTC 720

TCCCACCTGG TGGGGTATCA CCAGAGAAAA AGCCCCTCGT GGCGGTCAAT AGCCGTGGTG 780

TTCAGTTCAC TCAACGCCAG TCTGGACCCC CTGCTCTTCT ATTTCTCTTC TTCAGTGGTG 840

CGCAGGGCAT TTGGGAGAGG GCTGCAGGTG CTGCGGAATC AGGGCTCCTC CCTGTTGGGA 900

CGCAGAGGCA AAGACACAGC AGAGGGGACA AATGAGGACA GGGGTCTGGG TCAAGGAGAA 960

GGGATGCCAA GTTCGGACTT CACTACAGAG TAG 993
(277) INFORMATION FOR SEQ ID NO: 276: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 330 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:

Met Leu Pro Asp Trp Lys Ser Ser Leu Ile Leu Met Ala Tyr Ile Ile
1 5 10 15

Ile Phe Leu Thr Gly Leu Pro Ala Asn Leu Leu Ala Leu Arg Ala Phe
20 25 30

Val Gly Arg Ile Arg Gln Pro Gln Pro Ala Pro Val His Ile Leu Leu
35 40 45

Leu Ser Leu Thr Leu Ala Asp Leu Leu Leu Leu Leu Leu Leu Pro Phe
50 55 60

Lys Ile Ile Glu Ala Ala Ser Asn Phe Arg Trp Tyr Leu Pro Lys Val
65 70 75 80

Val Cys Ala Leu Thr Ser Phe Gly Phe Tyr Ser Ser Ile Tyr Cys Ser
85 90 95

Thr Trp Leu Leu Ala Gly Ile Ser Ile Glu Arg Tyr Leu Gly Val Ala
100 105 110

Phe Pro Val Gln Tyr Lys Leu Ser Arg Arg Pro Leu Tyr Gly Val Ile
115 120 125

Ala Ala Leu Val Ala Trp Val Met Ser Phe Gly His Cys Thr Ile Val
130 135 140

Ile Ile Val Gln Tyr Leu Asn Thr Thr Glu Gln Val Arg Ser Gly Asn
145 150 155 160

Glu Ile Thr Cys Tyr Glu Asn Phe Thr Asp Asn Gln Leu Asp Val Val
165 170 175

Leu Pro Val Arg Leu Glu Leu Cys Leu Val Leu Phe Phe Ile Pro Met
180 185 190

Ala Val Thr Ile Phe Cys Tyr Trp Arg Phe Val Trp Ile Met Leu Ser
195 200 205

Gln Pro Leu Val Gly Ala Gln Arg Arg Arg Arg Ala Lys Gly Leu Ala
210 215 220

Val Val Thr Leu Leu Asn Phe Leu Val Cys Phe Gly Pro Tyr Asn Val
225 230 235 240

Ser His Leu Val Gly Tyr His Gln Arg Lys Ser Pro Trp Trp Arg Ser
245 250 255

Ile Ala Val Val Phe Ser Ser Leu Asn Ala Ser Leu Asp Pro Leu Leu
260 265 270

Phe Tyr Phe Ser Ser Ser Val Val Arg Arg Ala Phe Gly Arg Gly Leu
275 280 285

Gln Val Leu Arg Asn Gln Gly Ser Ser Leu Leu Gly Arg Arg Gly Lys
290 295 300

Asp Thr Ala Glu Gly Thr Asn Glu Asp Arg Gly Val Gly Gln Gly Glu
305 310 315 320

Gly Met Pro Ser Ser Asp Phe Thr Thr Glu
325 330

(278) INFORMATION FOR SEQ ID NO: 277: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:

ATGGACACCT CCCGGCTCGG TGTGCTCCTG TCCTTGCCTG TGCTGCTGCA GCTGGCGACC 60 

GGGGGCAGCT CTCCCAGGTC TGGTGTGTTG CTGAGGGGCT GCCCGACACA CTGTCATTGC 120 

GAGCCCGACG GCAGGATGTT GCTCAGGGTG GACTGCTCCG ACCTGGGGCT CTCGGAGCTG 180 

CCTTCCAACC TCAGCGTCTT CACCTCCTAC CTAGACCTCA GTATGAACAA CATCAGTCAG 240 

CTGCTCCCGA ATCCCCTGCC CAGTCTCCGC TTCCTGGAGG AGTTACGTCT TGCGGGAAAC 300

GCTCTGACAT ACATTCCCAA GGGAGCATTC ACTGGCCTTT ACAGTCTTAA AGTTCTTATG 360

CTGCAGAATA ATCAGCTAAG ACACGTACCC ACAGAAGCTC TGCAGAATTT GCGAAGCCTT 420

CAATCCCTGC GTCTGGATGC TAACCACATC AGCTATGTGC CCCCAAGCTG TTTCAGTGGC 480

CTGCATTCCC TGAGGCACCT GTGGCTGGAT GACAATGCGT TAACAGAAAT CCCCGTCCAG 540

GCTTTTAGAA GTTTATCGGC ATTGCAAGCC ATGACCTTGG CCCTGAACAA AATACACCAC 600

ATACCAGACT ATGCCTTTGG AAACCTCTCC AGCTTGGTAG TTCTACATCT CCATAACAAT 660

AGAATCCACT CCCTGGGAAA GAAATGCTTT GATGGGCTCC ACAGCCTAGA GACTTTAGAT 720

TTAAATTACA ATAACCTTGA TGAATTCCCC ACTGCAATTA GGACACTCTC CAACCTTAAA 780

GAACTAGGAT TTCATAGCAA CAATATCAGG TCGATACCTG AGAAAGCATT TGTAGGCAAC 840

CCTTCTCTTA TTACAATACA TTTCTATGAC AATCCCATCC AATTTGTTGG GAGATCTGCT 900

TTTCAACATT TACCTGAACT AAGAACACTG ACTCTGAATG GTGCCTCACA AATAACTGAA 960

TTTCCTGATT TAACTGGAAC TGCAAACCTG GAGAGTCTX3A CTTTAACTGG AGCACAGATC 1020

TCATCTCTTC CTCAAACCGT CTGCAATCAG TTACCTAATC TCCAAGTGCT AGATCTCTCT 1080

TACAACCTAT TAGAAGATTT ACCCAGTTTT TCAGTCTGCC AAAAGCTTCA GAAAATTGAC 1140

CTAAGACATA ATGAAATCTA CGAAATTAAA GTTGACACTT TCCAGCAGTT GCTTAGCCTC 1200

CGATCGCTGA ATTTGGCTTG GAACAAAATT GCTATTATTC ACCCCAATGC ATTTTCCACT 1260

TTGCCATCCC TAATAAAGCT GGACCTATCG TCCAACCTCC TGTCGTCTTT TCCTATAACT 1320

GGOTTACATG GTTTAACTCA CTTAAAATTA ACAGGAAATC ATGCCTTACA GAGCTTGATA 1380

TCATCTGAAA ACTTTCCAGA ACTCAAGGTT ATAGAAATGC CTTATGCTTA CCAGTGCTGT 1440

GCATTTGGAG TGTGTGAGAA TGCCTATAAG ATTTCTAATC AATGGAATAA AGGTGACAAC 1500

AGCAGTATGG ACGACCTTCA TAAGAAAGAT GCTGGAATGT TTCAGGCTCA AGATGAACGT 1560

GACCTTGAAG ATTTCCTGCT TGACTTTGAG GAAGACCTGA AAGCCCTTCA TTCAGTGCAG 1620

TGTTCACCTT CCCCAGGCCC CTTCAAACCC TGTGAACACC TGCTTGATGG CTGGCTGATC 1680

AGAATTGGAG TGTGGACCAT AGCAGTTCTG GCACTTACTT GTAATCCTTT GGTGACTTCA 1740

ACAGTTTTCA GATCCCCTCT GTACATTTCC CCCATTAAAC TGTTAATTGG GGTCATCGCA 1800

GCAGTGAACA TGCTCACGGG AGTCTCCAGT GCCGTCCTGG CTGGTGTGGA TGCGTTCACT 1860

TTTGGCAGCT TTGCACGACA TGGTGCCTGG TGGGAGAATG GGGTTCGTTG CCATGTCATT 1920

GGTTTTTTGT CCATTTTTGC TTCAGAATCA TCTGTTTTCC TGCTTACTCT GGCAGCCCTG 1980

GAGCGTGGGT TCTCTGTGAA ATATTCTGCA AAATTTGAAA CGAAAGCTCC ATTTTCTAGC 2040

CTGAAAGTAA TCATTTTGCT CTGTGCCCTG CTGGCCTTGA CCATGGCCGC AGTTCCCCTG 2100

CTGGGTGGCA GCAAGTATGG CGCCTCCCCT CTCTGCCTGC CTTTGCCTTT TGGGGAGCCC 2160

AGCACCATGG GCTACATGGT CGCTCTCATC TTGCTCAATT CCCTTTGCTT CCTCATGATG 2220

ACCATTGCCT ACACCAAGCT CTACTGCAAT TTGGACAAGG GAGACCTGGA GAATATTTGG 2280

GACTGCTCTA TGAAAAAACA CATTGCCCTG TTGCTCTTCA CCAACTGCAT CCTAAACTGC 2340

CCTGTGGCTT TCTTGTCCTT CTCCTCTTTA ATAAACCTTA CATTTATCAG TCCTGAAGTA 2400

ATTAAGTTTA TCCTTCTGGT GGTAGTCCCA CTTCCTGCAT GTCTCAATCC CCTTCTCTAC 2460

ATCTTGTTCA ATCCTCACTT TAAGGAGGAT CTGGTGAGCC TGAGAAAGCA AACCTACGTC 2520

TGGACAAGAT CAAAACACCC AAGCTTGATG TCAATTAACT CTGATGATGT CGAAAAACAG 2580

TCCTGTGACT CAACTCAAGC CTTGGTAACC TTTACCAGCT CCAGCATCAC TTATGACCTG 2640 

CCTCCCAGTT CCGTGCCATC ACCAGCTTAT CCAGTGACTG AGAGCTGCCA TCTTTCCTCT 2700 

GTGGCATTTG TCCCATGTCT CTAA 2724 
(279) INFORMATION FOR SEQ ID NO: 278:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278: 

Met Asp Thr Ser Arg Leu Gly Val Leu Leu Ser Leu Pro Val Leu Leu
1 5 10 15

Gln Leu Ala Thr Gly Gly Ser Ser Pro Arg Ser Gly Val Leu Leu Arg
20 25 30

Gly Cys Pro Thr His Cys His Cys Glu Pro Asp Gly Arg Met Leu Leu
35 40 45

Arg Val Asp Cys Ser Asp Leu Gly Leu Ser Glu Leu Pro Ser Asn Leu
50 55 60

Ser Val Phe Thr Ser Tyr Leu Asp Leu Ser Met Asn Asn Ile Ser Gln
65 70 75 80

Leu Leu Pro Asn Pro Leu Pro Ser Leu Arg Phe Leu Glu Glu Leu Arg 
85 90 95 




Leu Ala Gly Asn Ala Leu Thr Tyr Ile Pro Lys Gly Ala Phe Thr Gly
100 105 110

Leu Tyr Ser Leu Lys Val Leu Met Leu Gln Asn Asn Gln Leu Arg His
115 120 125

Val Pro Thr Glu Ala Leu Gln Asn Leu Arg Ser Leu Gln Ser Leu Arg
130 135 140

Leu Asp Ala Asn His Ile Ser Tyr Val Pro Pro Ser Cys Phe Ser Gly
145 150 155 160

Leu His Ser Leu Arg His Leu Trp Leu Asp Asp Asn Ala Leu Thr Glu
165 170 175

Ile Pro Val Gln Ala Phe Arg Ser Leu Ser Ala Leu Gln Ala Met Thr
180 185 190

Leu Ala Leu Asn Lys Ile His His Ile Pro Asp Tyr Ala Phe Gly Asn
195 200 205

Leu Ser Ser Leu Val Val Leu His Leu His Asn Asn Arg Ile His Ser
210 215 220

Leu Gly Lys Lys Cys Phe Asp Gly Leu His Ser Leu Glu Thr Leu Asp
225 230 235 240

Leu Asn Tyr Asn Asn Leu Asp Glu Phe Pro Thr Ala Ile Arg Thr Leu
245 250 255

Ser Asn Leu Lys Glu Leu Gly Phe His Ser Asn Asn Ile Arg Ser Ile
260 265 270

Pro Glu Lys Ala Phe Val Gly Asn Pro Ser Leu Ile Thr Ile His Phe
275 280 285

Tyr Asp Asn Pro Ile Gln Phe Val Gly Arg Ser Ala Phe Gln His Leu
290 295 300

Pro Glu Leu Arg Thr Leu Thr Leu Asn Gly Ala Ser Gln Ile Thr Glu
305 310 315 320

Phe Pro Asp Leu Thr Gly Thr Ala Asn Leu Glu Ser Leu Thr Leu Thr
325 330 335

Gly Ala Gln Ile Ser Ser Leu Pro Gln Thr Val Cys Asn Gln Leu Pro
340 345 350

Asn Leu Gln Val Leu Asp Leu Ser Tyr Asn Leu Leu Glu Asp Leu Pro
355 360 365

Ser Phe Ser Val Cys Gln Lys Leu Gln Lys Ile Asp Leu Arg His Asn
370 375 380

Glu Ile Tyr Glu Ile Lys Val Asp Thr Phe Gln Gln Leu Leu Ser Leu
385 390 395 400

Arg Ser Leu Asn Leu Ala Trp Asn Lys Ile Ala Ile Ile His Pro Asn
405 410 415

Ala Phe Ser Thr Leu Pro Ser Leu Ile Lys Leu Asp Leu Ser Ser Asn
420 425 430

Leu Leu Ser Ser Phe Pro Ile Thr Gly Leu His Gly Leu Thr His Leu
435 440 445

Lys Leu Thr Gly Asn His Ala Leu Gln Ser Leu Ile Ser Ser Glu Asn
450 455 460

Phe Pro Glu Leu Lys Val Ile Glu Met Pro Tyr Ala Tyr Gln Cys Cys
465 470 475 480

Ala Phe Gly Val Cys Glu Asn Ala Tyr Lys Ile Ser Asn Gln Trp Asn
485 490 495

Lys Gly Asp Asn Ser Ser Met Asp Asp Leu His Lys Lys Asp Ala Gly
500 505 510

Met Phe Gln Ala Gln Asp Glu Arg Asp Leu Glu Asp Phe Leu Leu Asp
515 520 525

Phe Glu Glu Asp Leu Lys Ala Leu His Ser Val Gln Cys Ser Pro Ser
530 535 540






Pro Gly Pro Phe Lys Pro Cys Glu His Leu Leu Asp Gly Trp Leu Ile
545 550 555 560

Arg Ile Gly Val Trp Thr Ile Ala Val Leu Ala Leu Thr Cys Asn Ala
565 570 575

Leu Val Thr Ser Thr Val Phe Arg Ser Pro Leu Tyr Ile Ser Pro Ile
580 585 590

Lys Leu Leu Ile Gly Val Ile Ala Ala Val Asn Met Leu Thr Gly Val
595 600 605

Ser Ser Ala Val Leu Ala Gly Val Asp Ala Phe Thr Phe Gly Ser Phe
610 615 620

Ala Arg His Gly Ala Trp Trp Glu Asn Gly Val Gly Cys His Val Ile
625 630 635 640

Gly Phe Leu Ser Ile Phe Ala Ser Glu Ser Ser Val Phe Leu Leu Thr
645 650 655

Leu Ala Ala Leu Glu Arg Gly Phe Ser Val Lys Tyr Ser Ala Lys Phe
660 665 670

Glu Thr Lys Ala Pro Phe Ser Ser Leu Lys Val Ile Ile Leu Leu Cys
675 680 685

Ala Leu Leu Ala Leu Thr Met Ala Ala Val Pro Leu Leu Gly Gly Ser
690 695 700

Lys Tyr Gly Ala Ser Pro Leu Cys Leu Pro Leu Pro Phe Gly Glu Pro
705 710 715 720

Ser Thr Met Gly Tyr Met Val Ala Leu Ile Leu Leu Asn Ser Leu Cys
725 730 735

Phe Leu Met Met Thr Ile Ala Tyr Thr Lys Leu Tyr Cys Asn Leu Asp
740 745 750

Lys Gly Asp Leu Glu Asn Ile Trp Asp Cys Ser Met Lys Lys His Ile
755 760 765

Ala Leu Leu Leu Phe Thr Asn Cys Ile Leu Asn Cys Pro Val Ala Phe
770 775 780

Leu Ser Phe Ser Ser Leu Ile Asn Leu Thr Phe Ile Ser Pro Glu Val
785 790 795 800

Ile Lys Phe Ile Leu Leu Val Val Val Pro Leu Pro Ala Cys Leu Asn
805 810 815

Pro Leu Leu Tyr Ile Leu Phe Asn Pro His Phe Lys Glu Asp Leu Val
820 825 830

Ser Leu Arg Lys Gln Thr Tyr Val Trp Thr Arg Ser Lys His Pro Ser
835 840 845

Leu Met Ser Ile Asn Ser Asp Asp Val Glu Lys Gln Ser Cys Asp Ser
850 855 860

Thr Gln Ala Leu Val Thr Phe Thr Ser Ser Ser Ile Thr Tyr Asp Leu
865 870 875 880

Pro Pro Ser Ser Val Pro Ser Pro Ala Tyr Pro Val Thr Glu Ser Cys
885 890 895

His Leu Ser Ser Val Ala Phe Val Pro Cys Leu
900 905

(280) INFORMATION FOR SEQ ID NO: 279: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279: 

CATGCCAACC GGCCCGGGAG GCTGCTGCTG GT 32

(281) INFORMATION FOR SEQ ID NO: 280:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280: 



ACCAGCAGCA GCCTCGCGGG CCGGTTGGCA TG